ECVA

ECCV Conference Papers

The DOI links will be inaccessible until released by Springer.

Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images
Jacopo Bonato*, Marco Cotogni, Luigi Sabetta*
[pdf]
[DOI]

Octopus: Embodied Vision-Language Programmer from Environmental Feedback
Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, ChenCheng Jiang, Haoran Tan, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu*
[pdf]
[DOI]

FunQA: Towards Surprising Video Comprehension
Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu*
[pdf]
[DOI]

4D Contrastive Superflows are Dense 3D Representation Learners
Xiang Xu*, Lingdong Kong, Hui Shuai, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Qingshan Liu*
[pdf]
[DOI]

ItTakesTwo: Leveraging Peer Representations for Semi-supervised LiDAR Semantic Segmentation
Yuyuan Liu*, Yuanhong Chen, Hu Wang, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro
[pdf]
[DOI]

Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
Keqiang Sun, Dor Litvak, Yunzhi Zhang, Hongsheng Li, Jiajun Wu*, Shangzhe Wu*
[pdf]
[DOI]

Robust Fitting on a Gate Quantum Computer
Frances F Yang*, Michele Sasdelli, Tat-Jun Chin
[pdf]
[DOI]

H-V2X: A Large Scale Highway Dataset for BEV Perception
Chang Liu*, MingXu Zhu, Cong Ma
[pdf]
[DOI]

Learning Camouflaged Object Detection from Noisy Pseudo Label
Jin Zhang*, Ruiheng Zhang*, Yanjiao Shi, Zhe Cao, Nian Liu, Fahad Shahbaz Khan
[pdf]
[DOI]

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance
Kuan-Chih Huang*, Yi-Hsuan Tsai, Ming-Hsuan Yang
[pdf]
[DOI]

Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions
Weng Fei Low*, Gim Hee Lee
[pdf]
[DOI]

CLR-GAN: Improving GANs Stability and Quality via Consistent Latent Representation and Reconstruction
Shengke Sun, Ziqian Luan, Zhanshan Zhao*, Shijie Luo, Shuzhen Han*
[pdf]
[DOI]

Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence
Mengyao Lyu, Tianxiang Hao, Xinhao Xu, Hui Chen*, Zijia Lin, Jungong Han, Guiguang Ding*
[pdf]
[DOI]

PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
Zewen Chen, Haina Qin, Juan Wang, Chunfeng Yuan, Bing Li*, Weiming Hu, Leon Wang
[pdf]
[DOI]

Motion Mamba: Efficient and Long Sequence Motion Generation
Zeyu Zhang, Akide Liu, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang*
[pdf]
[DOI]

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis
Yuanhao Cai*, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille
[pdf]
[DOI]

Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
Liting Lin, Heng Fan, Zhipeng Zhang, Yaowei Wang*, Yong Xu, Haibin Ling*
[pdf]
[DOI]

A Direct Approach to Viewing Graph Solvability
Federica Arrigoni*, Andrea Fusiello, Tomas Pajdla
[pdf]
[DOI]

CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization
Jiawei Zhang, Jiahe Li, Xiaohan Yu, Lei Huang, Lin Gu, Jin Zheng*, Xiao Bai*
[pdf]
[DOI]

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
Qingwen Zhang*, Yi Yang, Peizheng Li, Olov Andersson, Patric Jensfelt
[pdf]
[DOI]

ZeST: Zero-Shot Material Transfer from a Single Image
Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, Varun Jampani*
[pdf]
[DOI]

3D Congealing: 3D-Aware Image Alignment in the Wild
Yunzhi Zhang*, Zizhang Li, Amit Raj, Andreas Engelhardt, Yuanzhen Li, Tingbo Hou, Jiajun Wu, Varun Jampani
[pdf]
[DOI]

SMooDi: Stylized Motion Diffusion Model
Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang*
[pdf]
[DOI]

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani*
[pdf]
[DOI]

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion
Vikram Voleti*, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitrii Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani*
[pdf]
[DOI]

WordRobe: Text-Guided Generation of Textured 3D Garments
Astitva Srivastava*, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma
[pdf]
[DOI]

Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation
Taekyung Ki*, Dongchan Min, Gyeongsu Chae*
[pdf]
[DOI]

SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras
Yingqi Tang, Zhaotie Meng, Guoliang Chen, Erkang Cheng*
[pdf]
[DOI]

EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Human Motion Generation
Wenyang Zhou, Zhiyang Dou*, Zeyu Cao, Zhouyingcheng Liao, Jingbo Wang, Wenjia Wang, Yuan Liu, Taku Komura, Wenping Wang, Lingjie Liu
[pdf]
[DOI]

Editable Image Elements for Controllable Synthesis
Jiteng Mu*, Michaël Gharbi, Richard Zhang, Eli Shechtman, Nuno Vasconcelos, Xiaolong Wang, Taesung Park*
[pdf]
[DOI]

Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Yuanwen Yue*, Anurag Das, Francis Engelmann, Siyu Tang, Jan Eric Lenssen
[pdf]
[DOI]

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection
Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Jiangning Zhang, Yabiao Wang*, Chengjie Wang, Cairong Zhao*
[pdf]
[DOI]

PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion
Runsong Zhu*, Shi Qiu*, Qianyi Wu, Ka-Hei Hui, Pheng-Ann Heng, Chi-Wing Fu
[pdf]
[DOI]

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization
Kailin Li*, Jingbo Wang, Lixin Yang, Cewu Lu*, Bo Dai
[pdf]
[DOI]

MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation
Jiaxi Jiang*, Paul Streli, Xuejing Luo, Christoph Gebhardt, Christian Holz
[pdf]
[DOI]

Simple Unsupervised Knowledge Distillation With Space Similarity
Aditya Singh*, Haohan Wang
[pdf]
[DOI]

DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
Ruining Li*, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
[pdf]
[DOI]

Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel Hüni, Keisuke Tateno, Marc Pollefeys, Federico Tombari, Marie-Julie Rakotosaona, Francis Engelmann*
[pdf]
[DOI]

Optimizing Illuminant Estimation in Dual-Exposure HDR Imaging
Mahmoud Afifi*, Zhenhua Hu, Liang Liang
[pdf]
[DOI]

BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos
Pilhyeon Lee*, Hyeran Byun
[pdf]
[DOI]

MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description
Ziqiang Zheng*, Yiwei Chen, Huimin Zeng, Tuan-Anh Vu, Binh-Son Hua, Sai-Kit Yeung
[pdf]
[DOI]

Superpixel-informed Implicit Neural Representation for Multi-Dimensional Data
Jia-Yi Li, Xi-Le Zhao*, Jian-Li Wang, Chao Wang, Min Wang
[pdf]
[DOI]

EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere
Jiaxi Jiang*, Paul Streli, Manuel Meier, Christian Holz
[pdf]
[DOI]

Physics-Free Spectrally Multiplexed Photometric Stereo under Unknown Spectral Composition
Satoshi Ikehata*, Yuta Asano
[pdf]
[DOI]

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction
Marko Mihajlovic*, Sergey Prokudin, Siyu Tang, Robert Maier, Federica Bogo, Tony Tung, Edmond Boyer
[pdf]
[DOI]

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
Junlin Han*, Filippos Kokkinos, Philip Torr
[pdf]
[DOI]

Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences
Shishir Reddy Vutukur*, Junwen Huang, Rasmus Laurvig Haugaard, Benjamin Busam, Tolga Birdal
[pdf]
[DOI]

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Muhammad Jehanzeb Mirza*, Leonid Karlinsky, Wei Lin, Sivan Doveh, Jakub Micorek, Mateusz Kozinski, Hilde Kuehne, Horst Possegger
[pdf]
[DOI]

Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang*, Hong-Xing Yu, Rundi Wu, Brandon Y Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman
[pdf]
[DOI]

Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians
Licheng Zhong, Hong-Xing Yu, Jiajun Wu, Yunzhu Li*
[pdf]
[DOI]

Deep Patch Visual SLAM
Lahav Lipson*, Zachary Teed, Jia Deng
[pdf]
[DOI]

Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints
Qianyi Wu*, Jianmin Zheng, Jianfei Cai
[pdf]
[DOI]

HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting
Helisa Dhamo*, Yinyu Nie, Arthur Moreau, Jifei Song, Richard Shaw, Yiren Zhou, Eduardo Pérez-Pellitero*
[pdf]
[DOI]

LayeredFlow: A Real-World Benchmark for Non-Lambertian Multi-Layer Optical Flow
Hongyu Wen*, Erich Liang, Jia Deng
[pdf]
[DOI]

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal
Yuxin Wang, Qianyi Wu, Guofeng Zhang, Dan Xu*
[pdf]
[DOI]

Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann*, Ziyun Wang, Ioannis Asmanis, Kenneth Chaney, Guillermo Gallego, Kostas Daniilidis
[pdf]
[DOI]

Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning
Cong Wu, Xiao-Jun Wu*, Linze Li, Tianyang Xu, Zhenhua Feng, Josef Kittler
[pdf]
[DOI]

Text2Place: Affordance-aware Text Guided Human Placement
Rishubh Parihar*, Harsh Gupta, Sachidanand VS, Venkatesh Babu Radhakrishnan
[pdf]
[DOI]

OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations
Yiming Zuo*, Jia Deng
[pdf]
[DOI]

Zero-Shot Multi-Object Scene Completion
Shun Iwase*, Katherine Liu, Vitor Guizilini, Adrien Gaidon, Kris Kitani, Rareș A Ambruș, Sergey Zakharov
[pdf]
[DOI]

Beta-Tuned Timestep Diffusion Model
Tianyi Zheng*, Peng-Tao Jiang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang*, Bo Li*
[pdf]
[DOI]

POA: Pre-training Once for Models of All Sizes
Yingying Zhang*, Xin Guo, Jiangwei Lao, Lei Yu, Lixiang Ru, Jian Wang, Guo Ye, Huimei He, Jingdong Chen, Ming Yang*
[pdf]
[DOI]

Taming Latent Diffusion Model for Neural Radiance Field Inpainting
Chieh Hubert Lin*, Changil Kim, Jia-Bin Huang, Qinbo Li, Chih-Yao Ma, Johannes Kopf, Ming-Hsuan Yang, Hung-Yu Tseng
[pdf]
[DOI]

MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation
Xiaoshuai Hao*, Ruikai Li, Hui Zhang, Rong Yin, Dingzhe Li, Sangil Jung, Seung-In Park, ByungIn Yoo, Haimei Zhao, Jing Zhang
[pdf]
[DOI]

ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Yuxi Ren, Jie Wu*, Yanzuo Lu, Huafeng Kuang, Xin Xia, Xionghui Wang, Qianqian Wang, Yixing Zhu, Pan Xie, Shiyin Wang, Xuefeng Xiao, Yitong Wang, Min Zheng, Lean Fu
[pdf]
[DOI]

ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion
Sungmin Woo*, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee*
[pdf]
[DOI]

High-Resolution and Few-shot View Synthesis from Asymmetric Dual-lens Inputs
Ruikang Xu, Mingde Yao, Yue Li, Yueyi Zhang, Zhiwei Xiong*
[pdf]
[DOI]

Accelerating Image Super-Resolution Networks with Pixel-Level Classification
Jinho Jeong, Jinwoo Kim, Younghyun Jo, Seon Joo Kim*
[pdf]
[DOI]

LASS3D: Language-Assisted Semi-Supervised 3D Semantic Segmentation with Progressive Unreliable Data Exploitation
Jianan Li*, Qiulei Dong*
[pdf]
[DOI]

Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution
Xingyuan Li, Jinyuan Liu*, Zhixin Chen, Yang Zou, Long Ma, Xin Fan, Risheng Liu
[pdf]
[DOI]

Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim*, Hoseok Do*
[pdf]
[DOI]

Random Walk on Pixel Manifolds for Anomaly Segmentation of Complex Driving Scenes
Zelong Zeng*, Kaname Tomite
[pdf]
[DOI]

DySeT: a Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction
Mozghan Pourkeshavarz*, Arielle Zhang, Amir Rasouli
[pdf]
[DOI]

Track Everything Everywhere Fast and Robustly
Yunzhou Song, Jiahui Lei*, Ziyun Wang, Lingjie Liu, Kostas Daniilidis
[pdf]
[DOI]

Towards Open-ended Visual Quality Comparison
Haoning Wu, Hanwei Zhu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Annan Wang, Wenxiu Sun, Qiong Yan, Xiaohong Liu, Guangtao Zhai, Shiqi Wang, Weisi Lin*
[pdf]
[DOI]

FreeInit: Bridging Initialization Gap in Video Diffusion Models
Tianxing Wu*, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu
[pdf]
[DOI]

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
DongHyun Kim, Byeongho Heo, Dongyoon Han*
[pdf]
[DOI]

Eliminating Feature Ambiguity for Few-Shot Segmentation
Qianxiong Xu*, Guosheng Lin, Chen Change Loy, Cheng Long, Ziyue Li, Rui Zhao
[pdf]
[DOI]

Soft Prompt Generation for Domain Generalization
Shuanghao Bai*, Yuedi Zhang, Wanqi Zhou, Zhirong Luan, Badong Chen*
[pdf]
[DOI]

Shedding More Light on Robust Classifiers under the lens of Energy-based Models
Mujtaba Hussain Mirza*, Maria Rosaria Briglia*, Senad Beadini*, Iacopo Masi*
[pdf]
[DOI]

LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
Jiaxiang Tang*, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu
[pdf]
[DOI]

Mahalanobis Distance-based Multi-view Optimal Transport for Multi-view Crowd Localization
Qi Zhang, Kaiyi Zhang, Antoni B. Chan, Hui Huang*
[pdf]
[DOI]

RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images
Ziteng Cui*, Tatsuya Harada
[pdf]
[DOI]

SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic
Kashyap Chitta*, Daniel Dauner, Andreas Geiger
[pdf]
[DOI]

AFreeCA: Annotation-Free Counting for All
Adriano D'Alessandro*, Ali Mahdavi-Amiri, Ghassan Hamarneh
[pdf]
[DOI]

Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap
Junhao Dong, Piotr Koniusz*, Junxi Chen, Yew-Soon Ong*
[pdf]
[DOI]

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation
Yushi Lan, Fangzhou Hong, Shuai Yang, Shangchen Zhou, Xuyi Meng, Bo Dai, Xingang Pan, Chen Change Loy*
[pdf]
[DOI]

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion
Bohan Li*, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin, Wenjun Zeng
[pdf]
[DOI]

Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration
Xueyang Kang*, Zhaoliang Luan, Kourosh Khoshelham, Bing Wang*
[pdf]
[DOI]

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation
Chenxin Li*, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan
[pdf]
[DOI]

PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
Fernando Julio Cendra, Bingchen Zhao, Kai Han*
[pdf]
[DOI]

Sapiens: Foundation for Human Vision Models
Rawal Khirodkar*, Timur Bagautdinov, Julieta Martinez, Zhaoen Su, Austin T James, Peter Selednik, Stuart Anderson, Shunsuke Saito
[pdf]
[DOI]

Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation
Sehyung Lee*, Mijung Kim, Yeongnam Chae, Bjorn Stenger
[pdf]
[DOI]

Generating Human Interaction Motions in Scenes with Text Control
Hongwei Yi*, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe*
[pdf]
[DOI]

NOVUM: Neural Object Volumes for Robust Object Classification
Artur Jesslen*, Guofeng Zhang, Angtian Wang, Wufei Ma, Alan Yuille, Adam Kortylewski
[pdf]
[DOI]

Align before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception
Dingkang Yang, Ke Li, Dongling Xiao, Zedian Shao, Peng Sun, Liang Song*
[pdf]
[DOI]

HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects
Xintao Lv, Liang Xu, Yichao Yan*, Xin Jin, Congsheng Xu, Wu Shuwen, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang
[pdf]
[DOI]

SAIR: Learning Semantic-aware Implicit Representation
Canyu Zhang*, Xiaoguang Li*, Qing Guo*, Song Wang*
[pdf]
[DOI]

ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
Yixin Yang, Jiangxin Dong, Jinhui Tang, Jinshan Pan*
[pdf]
[DOI]

UNIC: Universal Classification Models via Multi-teacher Distillation
Yannis Kalantidis, Diane Larlus, Mert Bulent Sariyildiz*, Philippe Weinzaepfel, Thomas Lucas
[pdf]
[DOI]

Instance-dependent Noisy-label Learning with Graphical Model Based Noise-rate Estimation
Arpit Garg*, Cuong Cao Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro
[pdf]
[DOI]

Eliminating Warping Shakes for Unsupervised Online Video Stitching
Lang Nie, Chunyu Lin*, Kang Liao, Yun Zhang, Shuaicheng Liu, Rui Ai, Yao Zhao
[pdf]
[DOI]

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Haoran Wei*, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang
[pdf]
[DOI]

Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao*
[pdf]
[DOI]

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders
Jefferson Hernandez*, Ruben Villegas, Vicente Ordonez
[pdf]
[DOI]

E.T. the Exceptional Trajectory: Text-to-camera-trajectory generation with character awareness
Robin Courant*, Nicolas Dufour, Xi Wang, Marc Christie, Vicky Kalogeiton
[pdf]
[DOI]

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
Ming Hu*, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jurgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kaijing Zhou*, Zongyuan Ge*
[pdf]
[DOI]

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark
Zhengdi Yu, Shaoli Huang*, Yongkang Cheng, Tolga Birdal
[pdf]
[DOI]

AttnZero: Efficient Attention Discovery for Vision Transformers
Lujun Li, Zimian Wei*, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu*, Yike Guo*
[pdf]
[DOI]

Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search
Lujun Li, Haosen Sun, Shiwen Li, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu*, Yike Guo*
[pdf]
[DOI]

Auto-DAS: Automated Proxy Discovery for Training-free Distillation-aware Architecture Search
Haosen Sun, Lujun Li*, Peijie Dong, Zimian Wei, Shitong Shao
[pdf]
[DOI]

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation
Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang*, Wanli Ouyang
[pdf]
[DOI]

TimeCraft: Navigate Weakly-Supervised Temporal Grounded Video Question Answering via Bi-directional Reasoning
Huabin Liu, Xiao Ma, Cheng Zhong, Yang Zhang, Weiyao Lin*
[pdf]
[DOI]

Spectral Subsurface Scattering for Material Classification
Haejoon Lee*, Aswin Sankaranarayanan
[pdf]
[DOI]

nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding
Benjin Zhu*, Zhe Wang, Hongsheng Li*
[pdf]
[DOI]

Dynamic Neural Radiance Field From Defocused Monocular Video
Xianrui Luo, Huiqiang Sun, Juewen Peng, Zhiguo Cao*
[pdf]
[DOI]

PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu*, Pengxiang Ding, Siteng Huang, Min Zhang, Han Zhao, Donglin Wang
[pdf]
[DOI]

CarFormer: Self-Driving with Learned Object-Centric Representations
Shadi Hamdan*, Fatma Guney
[pdf]
[DOI]

FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
Wei Wu*, Qingnan Fan, Shuai Qin, Hong Gu, Ruoyu Zhao, Antoni Chan*
[pdf]
[DOI]

Plain-Det: A Plain Multi-Dataset Object Detector
Cheng Shi, Yuchen Zhu, Sibei Yang*
[pdf]
[DOI]

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation
Zhen Zhao*, Zicheng Wang, Dian Yu, Longyue Wang*, Yixuan Yuan, Luping Zhou
[pdf]
[DOI]

Cs2K: Class-specific and Class-shared Knowledge Guidance for Incremental Semantic Segmentation
Wei Cong*, Yang Cong, Yuyang Liu, Gan Sun
[pdf]
[DOI]

Synchronous Diffusion for Unsupervised Smooth Non-Rigid 3D Shape Matching
Dongliang Cao*, Zorah Laehner, Florian Bernard
[pdf]
[DOI]

Text-Guided Video Masked Autoencoder
David Fan*, Jue Wang, Shuai Liao, Zhikang Zhang, Vimal Bhat, Xinyu Li
[pdf]
[DOI]

Diffusion Models for Open-Vocabulary Segmentation
Laurynas Karazija*, Iro Laina, Andrea Vedaldi, Christian Rupprecht
[pdf]
[DOI]

Textual-Visual Logic Challenge: Understanding and Reasoning in Text-to-Image Generation
Peixi Xiong*, Michael A Kozuch, Nilesh Jain
[pdf]
[DOI]

EvSign: Sign Language Recognition and Translation with Streaming Events
Pengyu Zhang*, Hao Yin, Zeren Wang, Wenyue Chen, Sheng Ming Li, Dong Wang, Huchuan Lu, Xu Jia
[pdf]
[DOI]

QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Pengxiang Ding, Han Zhao, Wenjie Zhang, Wenxuan Song, Min Zhang, Siteng Huang, Ningxi Yang, Donglin Wang*
[pdf]
[DOI]

Zero-shot Object Counting with Good Exemplars
Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Xian Zhong*, Zheng Wang, Shengfeng He*
[pdf]
[DOI]

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
Jingye Chen*, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei
[pdf]
[DOI]

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang*, Wentao Zhao, Cao Chuan, Tianchen Deng, Jingchuan Wang, Weidong Chen*
[pdf]
[DOI]

PartSTAD: 2D-to-3D Part Segmentation Task Adaptation
Hyunjin Kim, Minhyuk Sung*
[pdf]
[DOI]

FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
Rajeev Yasarla*, Manish Kumar Singh, Hong Cai, Yunxiao Shi, Jisoo Jeong, Yinhao Zhu, Shizhong Han, Risheek Garrepalli, Fatih Porikli
[pdf]
[DOI]

LLM as Copilot for Coarse-grained Vision-and-Language Navigation
Yanyuan Qiao*, Qianyi Liu, Jiajun Liu, Jing Liu, Qi Wu
[pdf]
[DOI]

Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
Yeying Jin*, Xin Li, Jiadong Wang, Yan Zhan, Malu Zhang*
[pdf]
[DOI]

Unsupervised Moving Object Segmentation with Atmospheric Turbulence
Dehao Qin*, Ripon K Saha, Woojeh Chung, Suren Jayasuriya, Jinwei Ye, Nianyi Li
[pdf]
[DOI]

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
Zhihang Lin, Mingbao Lin, Meng Zhao, Rongrong Ji*
[pdf]
[DOI]

Uncertainty-Driven Spectral Compressive Imaging with Spatial-Frequency Transformer
Lintao Peng, Siyu Xie, Liheng Bian*
[pdf]
[DOI]

CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering
Haidong Zhu, Tianyu Ding*, Tianyi Chen, Ilya Zharkov, Ram Nevatia, Luming Liang
[pdf]
[DOI]

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping
Jiacheng Chen*, Yuefan Wu, Jiaqi Tan, Hang Ma, Yasutaka Furukawa*
[pdf]
[DOI]

Image Demoireing in RAW and sRGB Domains
Shuning Xu, Binbin Song, Xiangyu Chen, Xina Liu, Jiantao Zhou*
[pdf]
[DOI]

LiDAR-Event Stereo Fusion with Hallucinations
Luca Bartolomei*, Matteo Poggi, Andrea Conti, Stefano Mattoccia*
[pdf]
[DOI]

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
Sirnam Swetha*, Jinyu Yang, Tal Neiman, Mamshad Nayeem Rizve, Son Tran, Benjamin Yao, Trishul A Chilimbi, Mubarak Shah
[pdf]
[DOI]

Learning Anomalies with Normality Prior for Unsupervised Video Anomaly Detection
Haoyue Shi, Le Wang*, Sanping Zhou, Gang Hua, Wei Tang
[pdf]
[DOI]

Revisiting Supervision for Continual Representation Learning
Daniel Marczak*, Sebastian Cygert*, Tomasz Trzcinski*, Bartlomiej Twardowski*
[pdf]
[DOI]

FLAT: Flux-aware Imperceptible Adversarial Attacks on 3D Point Clouds
Keke Tang, Lujie Huang, Weilong Peng*, Daizong Liu, Xiaofei Wang, Yang Ma, Ligang Liu, Zhihong Tian
[pdf]
[DOI]

MMBENCH: Is Your Multi-Modal Model an All-around Player?
Yuan Liu*, Haodong Duan*, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
[pdf]
[DOI]

Implicit Filtering for Learning Neural Signed Distance Functions from 3D Point Clouds
Shengtao Li*, Ge Gao, Yudong Liu, Ming Gu, Yu-Shen Liu
[pdf]
[DOI]

Unsupervised Exposure Correction
Ruodai Cui*, Li Niu, Guosheng Hu
[pdf]
[DOI]

Anytime Continual Learning for Open Vocabulary Classification
Zhen Zhu*, Yiming Gong, Derek Hoiem*
[pdf]
[DOI]

External Knowledge Enhanced 3D Scene Generation from Sketch
Zijie Wu, Mingtao Feng*, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, Ajmal Mian
[pdf]
[DOI]

G3R: Gradient Guided Generalizable Reconstruction
Yun Chen*, Jingkang Wang, Ze Yang, Sivabalan Manivasagam*, Raquel Urtasun*
[pdf]
[DOI]

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
Shijie Zhou*, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas K Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi
[pdf]
[DOI]

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection
Yanguang Sun, Chunyan Xu, Jian Yang, Hanyu Xuan*, Lei Luo*
[pdf]
[DOI]

VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions
Seokha Moon, Hyun Woo, Hongbeen Park, Haeji Jung, Reza Mahjourian, Hyung-gun Chi, Hyerin Lim, Sangpil Kim, Jinkyu Kim*
[pdf]
[DOI]

Occluded Gait Recognition with Mixture of Experts: An Action Detection Perspective
Panjian Huang, Yunjie Peng, Saihui Hou*, Chunshui Cao, Xu Liu, Zhiqiang He, Yongzhen Huang*
[pdf]
[DOI]

EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
Shuai Tan*, Bin Ji, Mengxiao Bi, Ye Pan*
[pdf]
[DOI]

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
Chuofan Ma*, Yi Jiang*, Jiannan Wu, Zehuan Yuan, Xiaojuan Qi*
[pdf]
[DOI]

On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil*, Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao*
[pdf]
[DOI]

DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu*, Meili Wang*, Lizhuang Ma, Jian Chang, Jian Jun Zhang
[pdf]
[DOI]

Operational Open-Set Recognition and PostMax Refinement
Steve Cruz*, Ryan Rabinowitz, Manuel Günther, Terrance E. Boult
[pdf]
[DOI]

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation
Zhiyuan Ma*, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang
[pdf]
[DOI]

SINDER: Repairing the Singular Defects of DINOv2
Haoqi Wang, Tong Zhang, Mathieu Salzmann*
[pdf]
[DOI]

SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow
Yihan Wang*, Lahav O Lipson, Jia Deng
[pdf]
[DOI]

Learning Differentially Private Diffusion Models via Stochastic Adversarial Distillation
Bochao Liu, Pengju Wang, Shiming Ge*
[pdf]
[DOI]

General and Task-Oriented Video Segmentation
Mu Chen, Liulei Li, Wenguan Wang, Ruijie Quan, Yi Yang*
[pdf]
[DOI]

VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement
Hanjung Kim, Jaehyun Kang, Miran Heo, Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim*
[pdf]
[DOI]

LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri*, Matthew Walmer, Kamal Gupta, Abhinav Shrivastava
[pdf]
[DOI]

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
Ming Li*, Taojiannan Yang, Huafeng Kuang, Jie Wu, Zhaoning Wang, Xuefeng Xiao, Chen Chen
[pdf]
[DOI]

TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-Spoofing
Xudong Wang, Ke-Yue Zhang, Taiping Yao*, Qianyu Zhou, Shouhong Ding, Pingyang Dai*, Rongrong Ji
[pdf]
[DOI]

Prompting Future Driven Diffusion Model for Hand Motion Prediction
Bowen Tang*, Kaihao Zhang*, Wenhan Luo*, Wei Liu, Hongdong Li
[pdf]
[DOI]

Defect Spectrum: A Granular Look of Large-scale Defect Datasets with Rich Semantics
Shuai Yang, ZhiFei Chen, Pengguang Chen, Xi Fang, Yixun Liang, Shu Liu*, Yingcong Chen*
[pdf]
[DOI]

Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement
Kun Zhou*, Xinyu Lin, Wenbo Li, Xiaogang Xu, Yuanhao Cai, Zhonghang Liu, Xiaoguang Han, Jiangbo Lu
[pdf]
[DOI]

RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation
Li Li*, Hubert P. H. Shum, Toby P Breckon
[pdf]
[DOI]

UMBRAE: Unified Multimodal Brain Decoding
Weihao Xia*, Raoul de Charette, A. Cengiz Oztireli, Jing-Hao Xue
[pdf]
[DOI]

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
Gengze Zhou*, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu
[pdf]
[DOI]

3D Single-object Tracking in Point Clouds with High Temporal Variation
Qiao Wu, Kun Sun, Pei An, Mathieu Salzmann, Yanning Zhang, Jiaqi Yang*
[pdf]
[DOI]

Adaptive Multi-task Learning for Few-shot Object Detection
Yan Ren*, Yanling Li, Adams Wai-Kin Kong
[pdf]
[DOI]

Event Trojan: Asynchronous Event-based Backdoor Attacks
Ruofei Wang*, Qing Guo, Haoliang Li, Renjie Wan*
[pdf]
[DOI]

Stepwise Multi-grained Boundary Detector for Point-supervised Temporal Action Localization
Mengnan Liu, Le Wang*, Sanping Zhou, Kun Xia, Qi Wu, Qilin Zhang, Gang Hua
[pdf]
[DOI]

Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Ziyuan Luo, Boxin Shi, Haoliang Li, Renjie Wan*
[pdf]
[DOI]

Dropout Mixture Low-Rank Adaptation for Visual Parameters-Efficient Fine-Tuning
Zhengyi Fang, Yue Wang, Ran Yi*, Lizhuang Ma
[pdf]
[DOI]

OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers
Qitai Wang, Jiawei He, Yuntao Chen, Zhaoxiang Zhang*
[pdf]
[DOI]

LoA-Trans: Enhancing Visual Grounding by Location-Aware Transformers
Ziling Huang*, Shin'ichi Satoh
[pdf]
[DOI]

HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression
Yihang Chen*, Qianyi Wu, Weiyao Lin*, Mehrtash Harandi, Jianfei Cai
[pdf]
[DOI]

Energy-induced Explicit quantification for Multi-modality MRI fusion
Xiaoming Qi*, Yuan Zhang, Tong Wang, Guanyu Yang*, Yueming Jin*, Shuo Li
[pdf]
[DOI]

ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Muhammad Atif Butt*, Kai Wang, Javier Vazquez-Corral, Joost van de Weijer
[pdf]
[DOI]

Exemplar-free Continual Representation Learning via Learnable Drift Compensation
Alex Gomez-Villa*, Dipam Goswami, Kai Wang, Andy Bagdanov, Bartlomiej Twardowski, Joost van de Weijer
[pdf]
[DOI]

Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs
Mattia Segù*, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele
[pdf]
[DOI]

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition
Sumin Lee*, Yooseung Wang, Sangmin Woo, Changick Kim
[pdf]
[DOI]

DiffiT: Diffusion Vision Transformers for Image Generation
Ali Hatamizadeh*, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat
[pdf]
[DOI]

WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu*, Jiajun Bu, Qi Zheng, Cong Yao
[pdf]
[DOI]

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding
Changshuo Wang*, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan
[pdf]
[DOI]

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
Ke Fan, Junshu Tang, Weijian Cao, Ran Yi*, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma*
[pdf]
[DOI]

FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection
Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu*, Zhenghui Hu*, Baohui Wang, Yunhong Wang
[pdf]
[DOI]

SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs
Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Daniel Barath*
[pdf]
[DOI]

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu*
[pdf]
[DOI]

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Peng Gao, Hongsheng Li*
[pdf]
[DOI]

See and Think: Embodied Agent in Virtual Environment
Zhonghan Zhao, Xuan Wang, Wenhao Chai, Boyi Li, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang*
[pdf]
[DOI]

PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects
Guangcheng Chen*, Yicheng He, Li He, Hong Zhang
[pdf]
[DOI]

Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases
Xinpeng Liu, Yong-Lu Li*, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu*
[pdf]
[DOI]

VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich*, Niv Nayman*, Sharon Fogel, Inbal Lavi, Ron Litman, Shahar Tsiper, Royee Tichauer, Srikar Appalaraju, Shai Mazor, R. Manmatha
[pdf]
[DOI]

Masked Angle-Aware Autoencoder for Remote Sensing Images
Zhihao Li*, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao
[pdf]
[DOI]

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm
Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang*, Bin Li*
[pdf]
[DOI]

MultiGen: Zero-shot Image Generation from Multi-modal Prompts
Zhi-Fan Wu*, Lianghua Huang, Wei Wang, Yanheng Wei, Yu Liu
[pdf]
[DOI]

GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
Xianyu Chen*, Ming Jiang, Qi Zhao*
[pdf]
[DOI]

Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning
Yifeng Zhang, Ming Jiang, Qi Zhao*
[pdf]
[DOI]

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye*, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu*
[pdf]
[DOI]

Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets
Ishan Rajendrakumar Dave*, Fabian Caba, Mubarak Shah, Simon Jenni*
[pdf]
[DOI]

FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition
Ishan Rajendrakumar Dave*, Mamshad Nayeem Rizve*, Mubarak Shah
[pdf]
[DOI]

Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting
Yu Liu, Fatimah binti Khalid, Lei Wang, Youxi Zhang, Cunrui Wang*
[pdf]
[DOI]

UniCode: Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng*, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu*
[pdf]
[DOI]

When Do We Not Need Larger Vision Models?
Baifeng Shi*, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell
[pdf]
[DOI]

GVGEN: Text-to-3D Generation with Volumetric Representation
Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan*, Wanli Ouyang, Tong He*
[pdf]
[DOI]

Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model
Zhening Liu, Xinjie Zhang, Jiawei Shao, Zehong Lin*, Jun Zhang
[pdf]
[DOI]

UniINR: Event-guided Unified Rolling Shutter Correction, Deblurring, and Interpolation
Yunfan Lu*, Guoqiang Liang, Yusheng Wang, Lin Wang, Hui Xiong*
[pdf]
[DOI]

ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild
Chen Guo*, Tianjian Jiang, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song*, Otmar Hilliges
[pdf]
[DOI]

Weakly-supervised Camera Localization by Ground-to-satellite Image Registration
Yujiao Shi*, Hongdong Li, Akhil Perincherry, Ankit Vora
[pdf]
[DOI]

Dataset Growth
Ziheng Qin*, Zhaopan Xu, YuKun Zhou, Kai Wang*, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Radu Timofte, Xiaojiang Peng, Hongxun Yao*, Yang You*
[pdf]
[DOI]

MaRINeR: Enhancing Novel Views by Matching Rendered Images with Nearby References
Lukas Bösiger*, Mihai Dusmanu, Marc Pollefeys, Zuria Bauer
[pdf]
[DOI]

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
Sixiang Chen, Tian Ye, Kai Zhang, Zhaohu Xing, Yunlong Lin, Lei Zhu*
[pdf]
[DOI]

MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration
Yulin Ren, Xin Li*, Bingchen Li, Xingrui Wang, Mengxi China Guo, Shijie Zhao, Li Zhang, Zhibo Chen*
[pdf]
[DOI]

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning
Bolin Lai*, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M Rehg, Miao Liu
[pdf]
[DOI]

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
Guohao Sun*, Can Qin, Jiaminan Wang, Zeyuan Chen, Ran Xu, Zhiqiang Tao
[pdf]
[DOI]

Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation
Yujin Chen*, Yinyu Nie, Benjamin Ummenhofer, Reiner Birkl, Michael Paulitsch, Matthias Müller, Matthias Niessner
[pdf]
[DOI]

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai*, Fiona Ryan, Wenqi Jia, Miao Liu, James M Rehg
[pdf]
[DOI]

R^2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations
Xiang Li*, Kai Qiu, Jinglu Wang, Xiaohao Xu, Kashu Yamazaki, Hao Chen, Rita Singh, Xiaonan Huang, Bhiksha Raj
[pdf]
[DOI]

Self-supervised co-salient object detection via feature correspondences at multiple scales
Souradeep Chakraborty*, Dimitris Samaras
[pdf]
[DOI]

Differentiable Convex Polyhedra Optimization from Multi-view Images
Daxuan Ren*, Haiyi Mei, Hezi Shi, Jianmin Zheng, Jianfei Cai, Lei Yang
[pdf]
[DOI]

SlotLifter: Slot-guided Feature Lifting for Learning Object-Centric Radiance Fields
Yu Liu, Baoxiong Jia*, Yixin Chen, Siyuan Huang
[pdf]
[DOI]

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding
Baoxiong Jia*, Yixin Chen, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang
[pdf]
[DOI]

ADMap: Anti-disturbance Framework for Vectorized HD Map Construction
Haotian Hu, Fanyi Wang*, Yaonong Wang, Laifeng Hu, Jingwei Xu, Zhiwang Zhang*
[pdf]
[DOI]

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng*, Jun Zhang*
[pdf]
[DOI]

PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation
Shilin Yan*, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei Zhang*
[pdf]
[DOI]

Evaluating Text-to-Visual Generation with Image-to-Text Generation
Zhiqiu Lin*, Deepak Pathak, Baiqi Li, Jiayao Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan
[pdf]
[DOI]

SENC: Handling Self-collision in Neural Cloth Simulation
Zhouyingcheng Liao*, Sinan Wang, Taku Komura
[pdf]
[DOI]

HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation
Shanyan Guan, Yanhao Ge, Ying Tai*, Jian Yang, Wei Li, Mingyu You*
[pdf]
[DOI]

PartCraft: Crafting Creative Objects by Parts
Kam Woh Ng*, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields
Xiufeng Huang*, Ka Chun Cheung, Simon See, Renjie Wan*
[pdf]
[DOI]

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Yizhe Xiong, Hui Chen*, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
[pdf]
[DOI]

FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
Hang Hua*, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo
[pdf]
[DOI]

CrossScore: A Multi-View Approach to Image Evaluation and Scoring
Zirui Wang*, Wenjing Bian, Victor Adrian Prisacariu
[pdf]
[DOI]

Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang*, Dejan Markovic*, Chenliang Xu*, Alexander Richard*
[pdf]
[DOI]

m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Zixian Ma*, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
[pdf]
[DOI]

Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Jinxing Zhou*, Dan Guo*, Yuxin Mao, Yiran Zhong, Xiaojun Chang, Meng Wang*
[pdf]
[DOI]

High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding
Qi Zuo*, Xiaodong Gu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Qiu Lingteng, Liefeng Bo, Zilong Dong
[pdf]
[DOI]

Semi-Supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization
Hongtao Wu, Angelica I Aviles-Rivero, Yijun Yang, Jingjing Ren, Sixiang Chen, Haoyu Chen, Lei Zhu*
[pdf]
[DOI]

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang*
[pdf]
[DOI]

ReMamber: Referring Image Segmentation with Mamba Twister
Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong*, Ya Zhang, Yanfeng Wang*
[pdf]
[DOI]

TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Jiahe Li, Jiawei Zhang, Xiao Bai*, Jin Zheng*, Xin Ning, Jun Zhou, Lin Gu
[pdf]
[DOI]

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
Qilang Ye, Zitong Yu*, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
[pdf]
[DOI]

Segmentation-guided Layer-wise Image Vectorization with Gradient Fills
Hengyu Zhou, Hui Zhang*, Bin Wang*
[pdf]
[DOI]

Implicit Style-Content Separation using B-LoRA
Yarden Frenkel*, Yael Vinker, Ariel Shamir, Danny Cohen-Or
[pdf]
[DOI]

OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
Zijian Zhou*, Zheng Zhu, Holger Caesar, Miaojing Shi*
[pdf]
[DOI]

ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang*, Ruicong Liu, Yifei Huang*, Ryosuke Furuta, Yoichi Sato*
[pdf]
[DOI]

FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance
Jiedong Zhuang, Jiaqi Hu, Lianrui Mu, Rui Hu, Xiaoyu Liang, Jiangnan Ye, Haoji Hu*
[pdf]
[DOI]

U-COPE: Taking a Further Step to Universal 9D Category-level Object Pose Estimation
Li Zhang*, Weiqing Meng, Yan Zhong, Bin Kong, Mingliang Xu, Jianming Du, Xue Wang, Rujing Wang, Liu Liu
[pdf]
[DOI]

Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization
Naiyu Yin*, Hanjing Wang, Yue Yu, Tian Gao, Amit Dhurandhar, Qiang Ji
[pdf]
[DOI]

Rotary Position Embedding for Vision Transformer
Byeongho Heo*, Song Park, Dongyoon Han, Sangdoo Yun
[pdf]
[DOI]

Local All-Pair Correspondence for Point Tracking
Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim*, Joon-Young Lee*
[pdf]
[DOI]

MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
Youngmin Oh, Hyung-Il Kim, Seong Tae Kim*, Jung Uk Kim*
[pdf]
[DOI]

ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments
Taewoong Kim, Cheolhong Min, Byeonghwi Kim, Jinyeon Kim, Wonje Jeung, Jonghyun Choi*
[pdf]
[DOI]

S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis
Dongze Li*, Kang Zhao*, Wei Wang*, Yifeng Ma, Bo Peng, Yingya Zhang, Jing Dong
[pdf]
[DOI]

ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang, Jeongseok Hyun, Joungbin An, Youngjae Yu, Seon Joo Kim*
[pdf]
[DOI]

Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos
Subin Jeon, In Cho, Minsu Kim, Woong Oh Cho, Seon Joo Kim*
[pdf]
[DOI]

PQ-SAM: Post-training Quantization for Segment Anything Model
Xiaoyu Liu*, Xin Ding, Lei Yu, Yuanyuan Xi, Wei Li, Zhijun Tu, Jie Hu, Hanting Chen, Baoqun Yin, Zhiwei Xiong*
[pdf]
[DOI]

CPM: Class-conditional Prompting Machine for Audio-visual Segmentation
Yuanhong Chen*, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro
[pdf]
[DOI]

Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition
Shreyank N Gowda*, Anurag Arnab, Jonathan Huang
[pdf]
[DOI]

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment
Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang*
[pdf]
[DOI]

CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Faegheh Sardari*, Armin Mustafa, Philip JB Jackson, Adrian Hilton
[pdf]
[DOI]

Noise-assisted Prompt Learning for Image Forgery Detection and Localization
Dong Li, Jiaying Zhu, Xueyang Fu*, Xun Guo, Yidi Liu, Gang Yang, Jiawei Liu, Zheng-Jun Zha
[pdf]
[DOI]

Data Collection-free Masked Video Modeling
Yuchi Ishikawa*, Masayoshi Kondo, Yoshimitsu Aoki
[pdf]
[DOI]

Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
Qi Song*, Ziyuan Luo, Ka Chun Cheung, Simon See, Renjie Wan
[pdf]
[DOI]

Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
Tao Yang*, Rongyuan Wu, Peiran Ren, Xuansong Xie, Lei Zhang
[pdf]
[DOI]

AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation
Yanan Sun*, Yanchen Liu, Yinhao Tang, Wenjie Pei, Kai Chen
[pdf]
[DOI]

SEED: A Simple and Effective 3D DETR in Point Clouds
Zhe Liu, Jinghua Hou, Xiaoqing Ye, Tong Wang, Jingdong Wang, Xiang Bai*
[pdf]
[DOI]

AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion
Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo*, Farid Boussaid, Mohammed Bennamoun
[pdf]
[DOI]

Synergy of Sight and Semantics: Visual Intention Understanding with CLIP
Qu Yang, Mang Ye*, Dacheng Tao
[pdf]
[DOI]

Intrinsic Single-Image HDR Reconstruction
Sebastian Dille*, Chris Careaga*, Yagiz Aksoy
[pdf]
[DOI]

T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
Weijie Wei*, Fatemeh Karimi Nejadasl, Theo Gevers, Martin R. Oswald*
[pdf]
[DOI]

Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification
Linhao Qu*, Dingkang Yang, Dan Huang, Qinhao Guo, Rongkui Luo, Shaoting Zhang, Xiaosong Wang*
[pdf]
[DOI]

Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
Meng Chu, Zhedong Zheng*, Wei Ji, Tingyu Wang, Tat-Seng Chua
[pdf]
[DOI]

BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models
Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Tae-Hyun Oh*
[pdf]
[DOI]

Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene
Ruiyang Zhang*, Hu Zhang, Hang Yu, Zhedong Zheng*
[pdf]
[DOI]

DATENeRF: Depth-Aware Text-based Editing of NeRFs
Sara Rojas Martinez*, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli
[pdf]
[DOI]

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Qu Yunpeng*, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou
[pdf]
[DOI]

ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-agnostic Counting
Michael A Hobley*, Victor Adrian Prisacariu
[pdf]
[DOI]

Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery
Grzegorz Rypeść*, Daniel Marczak, Sebastian Cygert, Tomasz Trzcinski, Bartlomiej Twardowski
[pdf]
[DOI]

LaRa: Efficient Large-Baseline Radiance Fields
Anpei Chen*, Haofei Xu, Stefano Esposito, Siyu Tang, Andreas Geiger
[pdf]
[DOI]

Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement
Haodong Li*, Hao Lu, Yingcong Chen*
[pdf]
[DOI]

MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
Kanglei Zhou, Liyuan Wang, Xingxing Zhang, Hubert P. H. Shum, Frederick W. B. Li, Jianguo Li, Xiaohui Liang*
[pdf]
[DOI]

Grounding Language Models for Visual Entity Recognition
Zilin Xiao*, Ming Gong, Paola Cascante-Bonilla, Xingyao Zhang, Jie Wu, Vicente Ordonez*
[pdf]
[DOI]

ELSE: Efficient Deep Neural Network Inference through Line-based Sparsity Exploration
Zeqi Zhu*, Alberto Garcia-Ortiz, Luc Waeijen, Egor Bondarev, Arash Pourtaherian, Orlando Moreira
[pdf]
[DOI]

DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation
Yiqun Duan*, Xianda Guo*, Zheng Zhu
[pdf]
[DOI]

DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao, Haolin Wang, Jie Zhou, Jiwen Lu*
[pdf]
[DOI]

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Yufu Wang*, Ziyun Wang, Lingjie Liu, Kostas Daniilidis
[pdf]
[DOI]

MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
Ziyue Huang, Yongchao Feng, Qingjie Liu*, Yunhong Wang
[pdf]
[DOI]

Self-Supervised Video Copy Localization with Regional Token Representation
Minlong Lu*, Yichen Lu, Siwei Nie, Xudong Yang, Xiaobo Zhang
[pdf]
[DOI]

Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
Claudio Rota*, Marco Buzzelli, Joost van de Weijer
[pdf]
[DOI]

RoGUENeRF: A Robust Geometry-Consistent Universal Enhancer for NeRF
Sibi Catley-Chandar*, Richard Shaw, Gregory Slabaugh, Eduardo Pérez Pellitero
[pdf]
[DOI]

Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
ShahRukh Athar*, Shunsuke Saito, Stanislav Pidhorskyi, Zhengyu Yang, Chen Cao
[pdf]
[DOI]

ControlLLM: Augment Language Models with Tools by Searching on Graphs
Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao, Erfei Cui, Ziheng Li, Xizhou Zhu, Lewei Lu, Qifeng Chen*, Yu Qiao, Jifeng Dai, Wenhai Wang*
[pdf]
[DOI]

UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction
Lan Feng, Mohammadhossein Bahari*, Kaouther Messaoud, Eloi Zablocki, Matthieu Cord, Alexandre Alahi
[pdf]
[DOI]

DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors
Zizheng Yan*, Jiapeng Zhou, Fanpeng Meng, Yushuang Wu, Lingteng Qiu, Zisheng Ye, Shuguang Cui, Guanying Chen, Xiaoguang Han*
[pdf]
[DOI]

Vamos: Versatile Action Models for Video Understanding
Shijie Wang*, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun
[pdf]
[DOI]

Prioritized Semantic Learning for Zero-shot Instance Navigation
Xinyu Sun*, Lizhao Liu, Hongyan Zhi, Ronghe Qiu, Junwei Liang*
[pdf]
[DOI]

RoadPainter: Points Are Ideal Navigators for Topology transformER
Zhongxing Ma, Liang Shuang, Yongkun Wen, Weixin Lu, Guowei Wan*
[pdf]
[DOI]

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
Linjiang Huang*, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu, Hongsheng Li*
[pdf]
[DOI]

Can OOD Object Detectors Learn from Foundation Models?
Jiahui Liu*, Xin Wen, Shizhen Zhao, Yingxian Chen, Xiaojuan Qi*
[pdf]
[DOI]

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
Xiang Fan*, Anand Bhattad, Ranjay Krishna
[pdf]
[DOI]

MERLiN: Single-Shot Material Estimation and Relighting for Photometric Stereo
Ashish Tiwari*, Satoshi Ikehata, Shanmuganathan Raman
[pdf]
[DOI]

Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-training
Qiangqiang Wu, Yan Xia*, Jia Wan, Antoni Chan
[pdf]
[DOI]

Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation
Junsung Lee, Minsoo Kang, Bohyung Han*
[pdf]
[DOI]

Real-data-driven 2000 FPS Color Video from Mosaicked Chromatic Spikes
Siqi Yang*, Zhaojun Huang, Yakun Chang, Bin Fan, Zhaofei Yu, Boxin Shi
[pdf]
[DOI]

Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging
Peirong Liu*, Oula Puonti, Xiaoling Hu, Daniel C. Alexander, Juan E. Iglesias
[pdf]
[DOI]

TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts
Youssef Mansour*, Xuyang Zhong, Serdar Caglar, Reinhard Heckel
[pdf]
[DOI]

RadEdit: stress-testing biomedical vision models via diffusion image editing
Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez, Boris van Breugel, Daniel Coelho de Castro, Harshita Sharma, Valentina Salvatelli, Maria Teodora A Wetscherek, Hannah CM Richardson, Lungren Matthew, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse*
[pdf]
[DOI]

SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow
Orcun Cetintas*, Tim Meinhardt, Guillem Brasó, Laura Leal-Taixé
[pdf]
[DOI]

AdaDiffSR: Adaptive Region-aware Dynamic acceleration Diffusion Model for Real-World Image Super-Resolution
Yuanting Fan, Chengxu Liu, Nengzhong Yin, Changlong Gao, Xueming Qian*
[pdf]
[DOI]

Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion
Xu Hang, Chen Long, Wenxiao Zhang*, Yuan Liu, Zhen Cao, Zhen Dong, Bisheng Yang
[pdf]
[DOI]

Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
Taewoo Kim, Jaeseok Jeong, Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon*
[pdf]
[DOI]

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua, Zixin Zhu*
[pdf]
[DOI]

TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks
Jinjie Mai*, Wenxuan Zhu, Sara Rojas, Jesus Zarzar, Abdullah Hamdi, Guocheng Qian, Bing Li, Silvio Giancola, Bernard Ghanem
[pdf]
[DOI]

COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
Liu He*, Daniel Aliaga
[pdf]
[DOI]

Joint RGB-Spectral Decomposition Model Guided Image Enhancement in Mobile Photography
Kailai Zhou*, Lijing Cai, Yibo Wang, Mengya Zhang, Bihan Wen, Qiu Shen*, Xun Cao
[pdf]
[DOI]

SpatialFormer: Towards Generalizable Vision Transformers with Explicit Spatial Understanding
Han Xiao, Wenzhao Zheng, Sicheng Zuo, Peng Gao, Jie Zhou, Jiwen Lu*
[pdf]
[DOI]

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving
Wenzhao Zheng, Weiliang Chen, Yuanhui Huang, Borui Zhang, Yueqi Duan, Jiwen Lu*
[pdf]
[DOI]

MyVLM: Personalizing VLMs for User-Specific Queries
Yuval Alaluf*, Elad Richardson, Sergey Tulyakov, Kfir Aberman, Danny Cohen-Or
[pdf]
[DOI]

AMEGO: Active Memory from long EGOcentric videos
Gabriele Goletto*, Tushar Nagarajan, Giuseppe Averta, Dima Damen
[pdf]
[DOI]

Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment
Simon Weber*, Je Hyeong Hong, Daniel Cremers
[pdf]
[DOI]

Collaborative Control for Geometry-Conditioned PBR Image Generation
Shimon Vainer, Mark Boss, Mathias Parger, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Nicolas Perony, Simon Donné*
[pdf]
[DOI]

Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
Seonghui Min, Hyun-Jic Oh, Won-Ki Jeong*
[pdf]
[DOI]

One-stage Prompt-based Continual Learning
Youngeun Kim*, Yuhang Li, Priyadarshini Panda
[pdf]
[DOI]

SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images
Nir Barel*, Ron A Shapira Weber*, Nir Mualem, Shahaf E Finder, Oren Freifeld*
[pdf]
[DOI]

APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension
Yaxin Luo, Jiayi Ji, Xiaofu Chen, Yuxin Zhang, Tianhe Ren, Gen Luo*
[pdf]
[DOI]

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
Yuhang Li*, Youngeun Kim, Donghyun Lee, Souvik Kundu, Priyadarshini Panda
[pdf]
[DOI]

MVDD: Multi-View Depth Diffusion Models
Zhen Wang*, Qiangeng Xu, Feitong Tan, Menglei Chai, Shichen Liu, Rohit Pandey, Sean Fanello, Achuta Kadambi, Yinda Zhang
[pdf]
[DOI]

Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data
Wufei Ma*, Kai Li, Zhongshi Jiang, Moustafa Meshry, Qihao Liu, Huiyu Wang, Christian Haene, Alan Yuille
[pdf]
[DOI]

Risk-Aware Self-Consistent Imitation Learning for Trajectory Planning in Autonomous Driving
Yixuan Fan*, Ya-Li Li, Shengjin Wang*
[pdf]
[DOI]

Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation
Ruijie Xu*, Chuyu Zhang, Hui Ren, Xuming He
[pdf]
[DOI]

EBDM: Exemplar-guided Image Translation with Brownian-bridge Diffusion Models
Eungbean Lee, Somi Jeong, Kwanghoon Sohn*
[pdf]
[DOI]

DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators
Hanyang Kong*, Dongze Lian, Michael Bi Mi, Xinchao Wang*
[pdf]
[DOI]

Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Duo Peng, Zhengbo Zhang, Ping Hu, Qiuhong Ke, David Yau, Jun Liu*
[pdf]
[DOI]

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer
Zijie Wu*, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai*
[pdf]
[DOI]

Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Cheeun Hong, Kyoung Mu Lee*
[pdf]
[DOI]

Large Motion Model for Unified Multi-Modal Motion Generation
Mingyuan Zhang*, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu*
[pdf]
[DOI]

FisherRF: Active View Selection and Mapping with Radiance Fields using Fisher Information
Wen Jiang*, Boshu Lei, Kostas Daniilidis*
[pdf]
[DOI]

Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding
Niloofar Azizi*, Mohsen Fayyaz, Horst Bischof
[pdf]
[DOI]

Gradient-based Out-of-Distribution Detection
Taha Entesari*, Sina Sharifi*, Bardia Safaei*, Vishal Patel, Mahyar Fazlyab
[pdf]
[DOI]

Event-based Mosaicing Bundle Adjustment
Shuang Guo*, Guillermo Gallego
[pdf]
[DOI]

ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan J Li, Gyungin Shin*
[pdf]
[DOI]

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
Seunggeun Chi*, Hyung-gun Chi, Hengbo Ma, Nakul Agarwal, Faizan Siddiqui, Karthik Ramani*, Kwonjoon Lee*
[pdf]
[DOI]

The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath*, Cheng-Yu Hsieh, Kai-Wei Chang, Ranjay Krishna
[pdf]
[DOI]

GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Jing Wu*, Jia-Wang Bian, Xinghui Li, Guangrun Wang, Ian Reid, Philip Torr, Victor Adrian Prisacariu*
[pdf]
[DOI]

Shapefusion: 3D localized human diffusion models
Rolandos Alexandros Potamias*, Michael Tarasiou, Stylianos Ploumpis, Stefanos Zafeiriou
[pdf]
[DOI]

Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing
Wonjun Kang, Kevin Galim, Hyung Il Koo*
[pdf]
[DOI]

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
Wentao Bao*, Lichang Chen, Heng Huang, Yu Kong
[pdf]
[DOI]

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment
Mengting Chen*, Xi Chen, Zhonghua Zhai, Chen Ju, Xuewen Hong, Jinsong Lan, Shuai Xiao
[pdf]
[DOI]

3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting
Zhe Jun Tang*, Tat-Jen Cham
[pdf]
[DOI]

Distribution-Aware Robust Learning from Long-Tailed Data with Noisy Labels
Jae Soon Baik*, In Young Yoon, Kun Hoon Kim, Jun Won Choi*
[pdf]
[DOI]

Free-Viewpoint Video of Outdoor Sports Using a Drone
Zhengdong Hong*
[pdf]
[DOI]

Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing
Haijin Zeng*, Hiep Luong, Wilfried Philips
[pdf]
[DOI]

ConGeo: Robust Cross-view Geo-localization across Ground View Variations
Li Mi, Chang Xu*, Javiera Castillo Navarro, Syrielle Montariol, Wen Yang, Antoine Bosselut, Devis Tuia
[pdf]
[DOI]

Generalizable Facial Expression Recognition
Yuhang Zhang, Xiuqi Zheng, Chenyi Liang, Jiani Hu*, Weihong Deng
[pdf]
[DOI]

GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views
Vinayak Gupta*, Rongali Simhachala Venkata Girish, Mukund Varma T, Ayush Tewari, Kaushik Mitra
[pdf]
[DOI]

Self-Supervised Any-Point Tracking by Contrastive Random Walks
Ayush Shrivastava*, Andrew Owens
[pdf]
[DOI]

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Tianchen Zhao*, Xuefei Ning, Tongcheng Fang, Enshu Liu, Guyue Huang, Zinan Lin, Shengen Yan, Guohao Dai, Yu Wang
[pdf]
[DOI]

Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin*, Gedas Bertasius
[pdf]
[DOI]

LCM-Lookahead for Encoder-based Text-to-Image Personalization
Rinon Gal*, Or Lichter, Elad Richardson, Or Patashnik, Amit Bermano, Gal Chechik, Danny Cohen-Or
[pdf]
[DOI]

Towards Architecture-Agnostic Untrained Networks Priors for Image Reconstruction with Frequency Regularization
Yilin Liu, Yunkui Pang, Jiang Li, Yong Chen, Pew-Thian Yap*
[pdf]
[DOI]

Towards Open-Ended Visual Recognition with Large Language Models
Qihang Yu*, Xiaohui Shen, Liang-Chieh Chen
[pdf]
[DOI]

Ray-Distance Volume Rendering for Neural Scene Reconstruction
Ruihong Yin*, Yunlu Chen, Sezer Karaoglu, Theo Gevers
[pdf]
[DOI]

ReNoise: Real Image Inversion Through Iterative Noising
Daniel Garibi*, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Danny Cohen-Or
[pdf]
[DOI]

Attention Decomposition for Cross-Domain Semantic Segmentation
Liqiang He*, Sinisa Todorovic
[pdf]
[DOI]

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary*, Or Patashnik, Kfir Aberman, Danny Cohen-Or
[pdf]
[DOI]

Handling The Non-Smooth Challenge in Tensor SVD: A Multi-Objective Tensor Recovery Framework
Jingjing Zheng, Wanglong Lu, Wenzhe Wang, Yankai Cao*, Xiaoqin Zhang, Xianta Jiang
[pdf]
[DOI]

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Bowen Zhang, Yiji Cheng, Chunyu Wang*, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo
[pdf]
[DOI]

GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
Yinghao Xu*, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein
[pdf]
[DOI]

IRGen: Generative Modeling for Image Retrieval
Yidan Zhang*, Ting Zhang*, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Jingdong Wang, Baining Guo
[pdf]
[DOI]

Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality
Kyu Ri Park, Hong Joo Lee*, Jung Uk Kim*
[pdf]
[DOI]

FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos
Florian Maximilian Langer*, Jihong Ju, Georgi Dikov, Gerhard Reitmayr, Mohsen Ghafoorian
[pdf]
[DOI]

A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke*, Bert De Brabandere
[pdf]
[DOI]

VISA: Reasoning Video Object Segmentation via Large Language Model
Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang*, Weidi Xie, Efstratios Gavves
[pdf]
[DOI]

Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models
Saman Motamed*, Danda Pani Paudel, Luc Van Gool
[pdf]
[DOI]

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
Yuanhao Zhai*, Kevin Lin, Linjie Li, Chung-Ching Lin, Jianfeng Wang, Zhengyuan Yang, David Doermann, Junsong Yuan, Zicheng Liu, Lijuan Wang
[pdf]
[DOI]

Scaling Backwards: Minimal Synthetic Pre-training?
Ryo Nakamura*, Ryu Tadokoro*, Ryosuke Yamada*, Yuki M Asano*, Iro Laina*, Christian Rupprecht*, Nakamasa Inoue*, Rio Yokota*, Hirokatsu Kataoka*
[pdf]
[DOI]

BAMM: Bidirectional Autoregressive Motion Model
Ekkasit Pinyoanuntapong*, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen
[pdf]
[DOI]

Event-based Head Pose Estimation: Benchmark and Method
Jiahui Yuan*, Hebei Li, Yansong Peng, Jin Wang, Yuheng Jiang, Yueyi Zhang*, Xiaoyan Sun
[pdf]
[DOI]

Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos
Ekta Prashnani*, Koki Nagano, Shalini De Mello, David P Luebke, Orazio Gallo
[pdf]
[DOI]

Towards Multi-modal Transformers in Federated Learning
Guangyu Sun*, Matias Mendieta, Aritra Dutta, Xin Li, Chen Chen
[pdf]
[DOI]

Fisher Calibration for Backdoor-Robust Heterogeneous Federated Learning
Wenke Huang, Mang Ye*, Zekun Shi, Bo Du*, Dacheng Tao
[pdf]
[DOI]

QueryCDR: Query-based Controllable Distortion Rectification Network for Fisheye Images
Pengbo Guo, Chengxu Liu, Xingsong Hou*, Xueming Qian
[pdf]
[DOI]

Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics
Shishira R Maiya*, Anubhav Gupta, Matthew A Gwilliam, Max Ehrlich, Abhinav Shrivastava
[pdf]
[DOI]

DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution
Shrey Singh*, Prateek Keserwani, Masakazu Iwamura*, Partha Pratim Roy
[pdf]
[DOI]

Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting
Jeongmin Bae, Seoha Kim, Youngsik Yun, Hahyun Lee, Gun Bang, Youngjung Uh*
[pdf]
[DOI]

DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion
Liao Shen, Tianqi Liu, Huiqiang Sun, Xinyi Ye, Baopu Li, Jianming Zhang, Zhiguo Cao*
[pdf]
[DOI]

CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection
Shuang Hao, Chunlin Zhong, He Tang*
[pdf]
[DOI]

Image-Feature Weak-to-Strong Consistency: An Enhanced Paradigm for Semi-Supervised Learning
Zhiyu Wu*, Jinshi Cui*
[pdf]
[DOI]

RPBG: Towards Robust Neural Point-based Graphics in the Wild
Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang Zheng*
[pdf]
[DOI]

GaussReg: Fast 3D Registration with Gaussian Splatting
Jiahao Chang*, Yinglin Xu, Yihao Li, Yuantao Chen, Wensen Feng, Xiaoguang Han
[pdf]
[DOI]

Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu*, Zhuofan Xia, Jiayi Guo, Dongchen Han, Qixiu Li, Duo Li, Yuhui Yuan, Ji Li, Yizeng Han, Shiji Song, Gao Huang*, Xiu Li*
[pdf]
[DOI]

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang*, Yuxi Wang, Shuai Li, Zhaoxiang Zhang, Zhen Lei, Lei Zhang
[pdf]
[DOI]

IAM-VFI: Interpolate Any Motion for Video Frame Interpolation with motion complexity map
Kihwan Yoon*, Yong Han Kim, Sungjei Kim*, Jinwoo Jeong*
[pdf]
[DOI]

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Siyi Du*, Shaoming Zheng, Yinsong Wang, Wenjia Bai, Declan P. O'Regan, Chen Qin*
[pdf]
[DOI]

Diffusion Model is a Good Pose Estimator from 3D RF-Vision
Junqiao Fan, Jianfei Yang*, Yuecong Xu, Lihua Xie
[pdf]
[DOI]

UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues
Vandad Davoodnia*, Saeed Ghorbani, Marc-André Carbonneau, Alexandre Messier, Ali Etemad
[pdf]
[DOI]

Learning 3D-aware GANs from Unposed Images with Template Feature Field
Xinya Chen, Hanlei Guo, Yanrui Bin, Shangzhan Zhang, Yuanbo Yang, Yujun Shen, Yue Wang, Yiyi Liao*
[pdf]
[DOI]

TAPTR: Tracking Any Point with Transformers as Detection
Hongyang Li*, Hao Zhang, Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Lei Zhang*
[pdf]
[DOI]

Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning
Shibo Jie, Yehui Tang, Jianyuan Guo, Zhi-Hong Deng*, Kai Han*, Yunhe Wang*
[pdf]
[DOI]

Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance
Jing Li, Junsong Fan*, Zhaoxiang Zhang*
[pdf]
[DOI]

BRAVE: Broadening the visual encoding of vision-language models
Oğuzhan Fatih Kar*, Alessio Tonioni*, Petra Poklukar, Achin Kulshrestha, Amir Zamir, Federico Tombari
[pdf]
[DOI]

HUMOS: Human Motion Model Conditioned on Body Shape
Shashank Tripathi*, Omid Taheri, Christoph Lassner*, Michael J. Black*, Daniel Holden*, Carsten Stoll*
[pdf]
[DOI]

Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan (Celine) Lin*
[pdf]
[DOI]

MVDiffHD: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction
Shitao Tang*, Jiacheng Chen, Dilin Wang, Chengzhou Tang, Fuyang Zhang, Yuchen Fan, Vikas Chandra, Yasutaka Furukawa, Rakesh Ranjan
[pdf]
[DOI]

FlowCon: Out-of-Distribution Detection using Flow-based Contrastive Learning
Saandeep Aathreya*, Shaun Canavan*
[pdf]
[DOI]

LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Archana Swaminathan*, Anubhav Gupta, Kamal Gupta, Shishira R Maiya, Vatsal Agarwal, Abhinav Shrivastava
[pdf]
[DOI]

Un-EVIMO: Unsupervised Event-based Independent Motion Segmentation
Ziyun Wang*, Jinyuan Guo, Kostas Daniilidis
[pdf]
[DOI]

Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration
Shihao Zhou, Jinshan Pan, Jinglei Shi*, Duosheng Chen, Lishen Qu, Jufeng Yang
[pdf]
[DOI]

CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians
Yang Liu, Chuanchen Luo, Lue Fan, Naiyan Wang, Junran Peng*, Zhaoxiang Zhang*
[pdf]
[DOI]

Bayesian Evidential Deep Learning for Online Action Detection
Hongji Guo, Hanjing Wang, Qiang Ji*
[pdf]
[DOI]

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou, Rui Lu, Jiayi Guo, Jinyi Hu, Zhiyuan Liu, Yuan Yao*, Gao Huang*
[pdf]
[DOI]

Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather
Junsung Park, Kyungmin Kim, Hyunjung Shim*
[pdf]
[DOI]

Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction
Xinhang Liu*, Jiaben Chen, Shiu-Hong Kao, Yu-Wing Tai, Chi-Keung Tang
[pdf]
[DOI]

Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu, Seohyun Lim, Hyunjung Shim*
[pdf]
[DOI]

VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing
Shang Liu*, Chaohui Yu, Chenjie Cao, Wen Qian, Fan Wang*
[pdf]
[DOI]

MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
Wenxun Dai, Ling-Hao Chen, Jingbo Wang*, Jinpeng Liu, Bo Dai*, Yansong Tang
[pdf]
[DOI]

Human Hair Reconstruction with Strand-Aligned 3D Gaussians
Egor Zakharov*, Vanessa Sklyarova, Michael J. Black, Giljoo Nam, Justus Thies, Otmar Hilliges
[pdf]
[DOI]

COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
Jiefeng Li*, Ye Yuan, Davis Rempe, Haotian Zhang, Pavlo Molchanov, Cewu Lu, Jan Kautz, Umar Iqbal*
[pdf]
[DOI]

SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang*, Jane Yung-jen Hsu*
[pdf]
[DOI]

Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection
Qijie Mo, Yipeng Gao, Shenghao Fu, Junkai Yan, Ancong Wu*, Wei-Shi Zheng*
[pdf]
[DOI]

Global-to-Pixel Regression for Human Mesh Recovery
Yabo Xiao, Mingshu He*, Dongdong Yu
[pdf]
[DOI]

Visible and Clear: Finding Tiny Objects in Difference Map
Bing Cao, Haiyu Yao, Pengfei Zhu*, Qinghua Hu
[pdf]
[DOI]

Rethinking Image Super Resolution from Training Data Perspectives
Go Ohtani*, Ryu Tadokoro, Ryosuke Yamada, Yuki M Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki
[pdf]
[DOI]

BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li*, Tiande Guo, Pingyu Wang, Xuecheng Nie
[pdf]
[DOI]

Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu, Benlin Liu, Jiahui Wang, Yuhao Dong, Guangyi Chen, Yongming Rao, Ranjay Krishna, Jiwen Lu*
[pdf]
[DOI]

FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior
Zhekai Chen, Wen Wang, Zhen Yang, Zeqing Yuan, Hao Chen*, Chunhua Shen*
[pdf]
[DOI]

Learning to Robustly Reconstruct Dynamic Scenes from Low-light Spike Streams
Liwen Hu*, Ziluo Ding, Mianzhi Liu, Lei Ma*, Tiejun Huang
[pdf]
[DOI]

MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang, Lechao Cheng*, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li*
[pdf]
[DOI]

WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang, Guanbin Li*, Philip Torr, Liang Lin
[pdf]
[DOI]

Interactive 3D Object Detection with Prompts
Ruifei Zhang, Xiangru Lin, Wei Zhang, Jincheng Lu, Xuekuan Wang, Xiao Tan, Yingying Li, Errui Ding, Jingdong Wang, Guanbin Li*
[pdf]
[DOI]

How Video Meetings Change Your Expression
Sumit Sarin*, Utkarsh Mall, Purva Tendulkar, Carl Vondrick
[pdf]
[DOI]

GRACE: Graph-Based Contextual Debiasing for Fair Visual Question Answering
Yifeng Zhang, Ming Jiang, Qi Zhao*
[pdf]
[DOI]

Neural Volumetric World Models for Autonomous Driving
Zanming Huang*, Jimuyang Zhang*, Eshed Ohn-Bar*
[pdf]
[DOI]

IVTP: Instruction-guided Visual Token Pruning for Large Vision-Language Models
Kai Huang*, Hao Zou, Ye Xi, Bochen Wang, Zhen Xie, Liang Yu
[pdf]
[DOI]

RegionDrag: Fast Region-Based Image Editing with Diffusion Models
Jingyi Lu, Xinghui Li, Kai Han*
[pdf]
[DOI]

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy
Letian Huang, Jiayang Bai, Jie Guo*, Yuanqi Li, Yanwen Guo
[pdf]
[DOI]

Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding
Talfan Evans*, Shreya Pathak, Hamza Merzic, Jonathan Richard Schwarz, Ryutaro Tanno, Olivier Henaff*
[pdf]
[DOI]

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration
Zhihao Liang*, Qi Zhang*, Wenbo Hu, Ying Feng, Lei Zhu, Kui Jia*
[pdf]
[DOI]

GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang*, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li*, Gao Huang*
[pdf]
[DOI]

Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer
Yu Deng*, Duomin Wang, Baoyuan Wang
[pdf]
[DOI]

CSOT: Cross-Scan Object Transfer for Semi-Supervised LiDAR Object Detection
Jinglin Zhan, Tiejun Liu, Rengang Li, Zhaoxiang Zhang, Yuntao Chen*
[pdf]
[DOI]

Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation
Chang Liu, Giulia Rizzoli, Pietro Zanuttigh, Fu Li, Yi Niu*
[pdf]
[DOI]

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen*, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao*, Dahua Lin*
[pdf]
[DOI]

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
Yunhao Gou*, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok, Yu Zhang*
[pdf]
[DOI]

Invertible Neural Warp for NeRF
Shin-Fang Chng*, Ravi Garg, Hemanth Saratchandran, Simon Lucey
[pdf]
[DOI]

Enhancing Vectorized Map Perception with Historical Rasterized Maps
Xiaoyu Zhang, Guangwei Liu, Zihao Liu, Ningyi Xu, Yunhui Liu*, Ji Zhao
[pdf]
[DOI]

Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim*, Boseung Jeong, Donghyun Kim, Suha Kwak*
[pdf]
[DOI]

Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
Cheng Shi, Yulin Zhang, Bin Yang, Jiajin Tang, Yuexin Ma, Sibei Yang*
[pdf]
[DOI]

PetFace: A Large-Scale Dataset and Benchmark for Animal Identification
Risa Shinoda*, Kaede Shiohara
[pdf]
[DOI]

MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
Tianqi Liu, Guangcong Wang, Shoukang Hu, Liao Shen, Xinyi Ye, Yuhang Zang, Zhiguo Cao*, Wei Li, Ziwei Liu
[pdf]
[DOI]

Zero-Shot Detection of AI-Generated Images
Davide Cozzolino, Giovanni Poggi, Matthias Niessner, Luisa Verdoliva*
[pdf]
[DOI]

Language-Image Pre-training with Long Captions
Kecheng Zheng*, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, Xin Jin, Wei Chen, Yujun Shen
[pdf]
[DOI]

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition
Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian*, Ping Luo, Ji Wu*
[pdf]
[DOI]

DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control
Xinyu Xu*, Shengcheng Luo, Yanchao Yang, Yong-Lu Li*, Cewu Lu*
[pdf]
[DOI]

You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception
Sheng Jin, Shuhuai Li, Tong Li, Wentao Liu*, Chen Qian, Ping Luo*
[pdf]
[DOI]

Towards Real-World Adverse Weather Image Restoration: Enhancing Clearness and Semantics with Vision-Language Models
Jiaqi Xu*, Mengyang Wu, Xiaowei Hu*, Chi-Wing Fu, Qi Dou, Pheng-Ann Heng
[pdf]
[DOI]

Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li*, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
[pdf]
[DOI]

CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao, Siyang Sun, Shuailei Ma, Kecheng Zheng, Yuxin Guo, Guosheng Zhao, Yun Zheng, Xingang Wang*
[pdf]
[DOI]

MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao, Tianyi Lu, Jiaxi Gu, Xing Zhang, Qingping Zheng, Zuxuan Wu*, Hang Xu, Yu-Gang Jiang
[pdf]
[DOI]

MambaIR: A Simple Baseline for Image Restoration with State-Space Model
Hang Guo*, Jinmin Li, Tao Dai*, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia
[pdf]
[DOI]

I Can't Believe It's Not Scene Flow!
Ishan Khatri*, Kyle Vedder*, Neehar Peri, Deva Ramanan, James Hays
[pdf]
[DOI]

Rethinking Unsupervised Outlier Detection via Multiple Thresholding
Zhonghang Liu*, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin
[pdf]
[DOI]

Compress3D: a Compressed Latent Space for 3D Generation from a Single Image
Bowen Zhang*, Tianyu Yang*, Yu Li, Lei Zhang, Xi Zhao*
[pdf]
[DOI]

Scalable Group Choreography via Variational Phase Manifold Learning
Nhat Le, Khoa Do, Xuan Bui, Tuong Do, Erman Tjiputra, Quang D. Tran, Anh Nguyen*
[pdf]
[DOI]

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang*, Ruicong Liu, Yoichi Sato
[pdf]
[DOI]

Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion
Jian Ma, Wenguan Wang*, Yi Yang, Feng Zheng
[pdf]
[DOI]

PoseSOR: Human Pose Can Guide Our Attention
Huankang Guan, Rynson W.H. Lau*
[pdf]
[DOI]

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
Bu Jin, Yupeng Zheng*, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao
[pdf]
[DOI]

Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim*, Hyung Suk Lim, Soonyoung Lee, Bumsoo Kim*, Gunhee Kim*
[pdf]
[DOI]

Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning
Peng Xiao, Yi Xie, Xuemiao Xu*, Weihong Chen, Huaidong Zhang*
[pdf]
[DOI]

InfMAE: A Foundation Model in The Infrared Modality
Fangcen Liu, Chenqiang Gao*, Yaming Zhang, Junjie Guo, Jinghao Wang, Deyu Meng
[pdf]
[DOI]

TPA3D: Triplane Attention for Fast Text-to-3D Generation
Bin-Shih Wu*, Hong-En Chen*, Sheng-Yu Huang, Yu-Chiang Frank Wang
[pdf]
[DOI]

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification
Jiangming Shi, Xiangbo Yin, Yeyun Chen, Yachao Zhang, Zhizhong Zhang, Yuan Xie*, Yanyun Qu*
[pdf]
[DOI]

LivePhoto: Real Image Animation with Text-guided Motion Control
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao*
[pdf]
[DOI]

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu*, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji
[pdf]
[DOI]

AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling
Sherry X. Chen*, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Misha Sra, Pradeep Sen
[pdf]
[DOI]

SEDiff: Structure Extraction for Domain Adaptive Depth Estimation via Denoising Diffusion Models
Dongseok Shim*, Hyoun Jin Kim*
[pdf]
[DOI]

Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao, Xiaohan Ding*, Juexiao Feng, Yuhong Yang, Hui Chen, Guiguang Ding*
[pdf]
[DOI]

Online Temporal Action Localization with Memory-Augmented Transformer
Youngkil Song, Dongkeun Kim, Minsu Cho, Suha Kwak*
[pdf]
[DOI]

Efficient Cascaded Multiscale Adaptive Network for Image Restoration
Yichen Zhou*, Pan Zhou*, Teck Khim Ng
[pdf]
[DOI]

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
Muyao Niu, Xiaodong Cun*, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng*
[pdf]
[DOI]

Occlusion-Aware Seamless Segmentation
Yihong Cao, Jiaming Zhang, Hao Shi, Kunyu Peng, Yuhongxuan Zhang, Hui Zhang*, Rainer Stiefelhagen, Kailun Yang*
[pdf]
[DOI]

OpenKD: Opening Prompt Diversity for Zero- and Few-shot Keypoint Detection
Changsheng Lu*, Zheyuan Liu, Piotr Koniusz*
[pdf]
[DOI]

Referring Atomic Video Action Recognition
Kunyu Peng*, Jia Fu, Kailun Yang, Di Wen, Yufan Chen, Ruiping Liu, Junwei Zheng, Jiaming Zhang, Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
[pdf]
[DOI]

Agent3D-Zero: An Agent for Zero-shot 3D Understanding
Sha Zhang, Di Huang, Jiajun Deng*, Shixiang Tang, Wanli Ouyang, Tong He*, Yanyong Zhang*
[pdf]
[DOI]

Stream Query Denoising for Vectorized HD-Map Construction
Shuo Wang*, Fan Jia, Weixin Mao, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao*
[pdf]
[DOI]

SAGS: Structure-Aware 3D Gaussian Splatting
Evangelos Ververas, Rolandos Alexandros Potamias*, Jifei Song, Jiankang Deng, Stefanos Zafeiriou
[pdf]
[DOI]

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval
Young Kyun Jang*, Dat B Huynh, Ashish Shah, Wen-Kai Chen, Ser-Nam Lim*
[pdf]
[DOI]

OneRestore: A Universal Restoration Framework for Composite Degradation
Yu Guo*, Yuan Gao, Yuxu Lu, Huilin Zhu, Wen Liu, Shengfeng He
[pdf]
[DOI]

Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Zikai Huang, Xuemiao Xu*, Cheng Xu*, Huaidong Zhang, Chenxi Zheng, Jing Qin, Shengfeng He
[pdf]
[DOI]

SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks
Peishen Yan, Hao Wang, Tao Song*, Yang Hua, Ruhui Ma, Ningxin Hu, Mohammad Reza Haghighat, Haibing Guan
[pdf]
[DOI]

RePOSE: 3D Human Pose Estimation via Spatio-Temporal Depth Relational Consistency
Ziming Sun, Yuan Liang, Zejun Ma, Tianle Zhang, Linchao Bao, Guiqing Li, Shengfeng He*
[pdf]
[DOI]

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting
Zheng Zhang, Wenbo Hu*, Yixing Lao, Tong He, Hengshuang Zhao*
[pdf]
[DOI]

WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
Tianjian Jiang*, Johsan Billingham, Sebastian Müksch, Juan J Zarate, Nicolas Evans, Martin R. Oswald, Marc Pollefeys, Otmar Hilliges, Manuel Kaufmann, Jie Song
[pdf]
[DOI]

Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Toan Nguyen, Minh Nhat Vu, Baoru Huang, An Dinh Vuong, Quan Vuong, Ngan Le, Thieu Vo, Anh Nguyen*
[pdf]
[DOI]

COIN-Matting: Confounder Intervention for Image Matting
Zhaohe Liao, Jiangtong Li, Jun Lan, Huijia Zhu, Weiqiang Wang, Li Niu*, Liqing Zhang*
[pdf]
[DOI]

SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng*, Yujiang Pu*, Shaogang Gong, Parisa Kordjamshidi, Yu Kong
[pdf]
[DOI]

Audio-driven Talking Face Generation with Stabilized Synchronization Loss
Dogucan Yaman*, Fevziye Irem Eyiokur, Leonard Bärmann, Hazim Kemal Ekenel, Alexander Waibel
[pdf]
[DOI]

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Md Mohaiminul Islam*, Tushar Nagarajan, Huiyu Wang, Fu-Jen Chu, Kris Kitani, Gedas Bertasius, Xitong Yang
[pdf]
[DOI]

Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation
Björn Michele*, Alexandre Boulch, Tuan-Hung Vu, Gilles Puy, Renaud Marlet, Nicolas Courty
[pdf]
[DOI]

Learning to Obstruct Few-Shot Image Classification over Restricted Classes
Amber Yijia Zheng*, Chiao-An Yang*, Raymond A. Yeh
[pdf]
[DOI]

RoofDiffusion: Constructing Roofs from Severely Corrupted Point Data via Diffusion
Kyle Shih-Huang Lo*, Jorg Peters, Eric Spellman
[pdf]
[DOI]

L-DiffER: Single Image Reflection Removal with Language-based Diffusion Model
Yuchen Hong*, Haofeng Zhong*, Shuchen Weng, Jinxiu S Liang, Boxin Shi
[pdf]
[DOI]

AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
Yu Wang*, Xiaogeng Liu*, Yu Li*, Muhao Chen, Chaowei Xiao*
[pdf]
[DOI]

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma*
[pdf]
[DOI]

CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner
Tingbing Yan, Wenzheng Zeng*, Yang Xiao*, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou
[pdf]
[DOI]

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning
Fucai Ke*, Zhixi Cai, Simindokht Jahangard, Weiqing Wang, Pari Delir Haghighi, Hamid Rezatofighi
[pdf]
[DOI]

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
Xuan Ju*, Xian Liu, Xintao Wang*, Yuxuan Bian, Ying Shan, Qiang Xu*
[pdf]
[DOI]

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Ning Yu*, Chia-chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul W Josel, Juan Carlos Niebles, Caiming Xiong, Ran Xu
[pdf]
[DOI]

Blind image deblurring with noise-robust kernel estimation
Chanseok Lee*, Jeongsol Kim, Seungmin Lee, Jaehwang Jung, Yunje Cho, Taejoong Kim, Taeyong Jo, Myungjun Lee, Mooseok Jang*
[pdf]
[DOI]

Binomial Self-compensation for Motion Error in Dynamic 3D Scanning
Geyou Zhang, Ce Zhu*, Kai Liu
[pdf]
[DOI]

AddMe: Zero-shot Group-photo Synthesis by Inserting People into Scenes
Dongxu Yue, Maomao Li, Yunfei Liu, Ailing Zeng, Tianyu Yang, Qin Guo, Yu Li*
[pdf]
[DOI]

Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation
Yue Xu, Yong-Lu Li*, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang
[pdf]
[DOI]

VersatileGaussian: Real-time Neural Rendering for Versatile Tasks using Gaussian Splatting
Renjie Li, Zhiwen Fan*, Bohua Wang, Peihao Wang, Zhangyang Wang, Xi Wu
[pdf]
[DOI]

Momentum Auxiliary Network for Supervised Local Learning
Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan*, Chenyang Si*
[pdf]
[DOI]

HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion
Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si*
[pdf]
[DOI]

Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains
Jaeyeul Kim, Jungwan Woo, Jeonghoon Kim, Sunghoon Im*
[pdf]
[DOI]

Improving Zero-Shot Generalization for CLIP with Variational Adapter
Ziqian Lu, Fengli Shen, Mushui Liu, Yunlong Yu*, Xi Li
[pdf]
[DOI]

Realistic Human Motion Generation with Cross-Diffusion Models
Zeping Ren, Shaoli Huang*, Xiu Li*
[pdf]
[DOI]

EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
Yuan-Ming Li, Wei-Jin Huang, An-Lan Wang, Ling-An Zeng, Jing-Ke Meng*, Wei-Shi Zheng*
[pdf]
[DOI]

Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection
Youheng Sun, Shengming Yuan, Xuanhan Wang*, Lianli Gao, Jingkuan Song
[pdf]
[DOI]

Towards Reliable Advertising Image Generation Using Human Feedback
Zhenbang Du*, Wei Feng, Haohan Wang, Yaoyu Li, Jingsen Wang, Jian Li, Zheng Zhang, Jingjing Lv, Xin Zhu, Junsheng Jin, Junjie Shen, Zhangang Lin, Jingping Shao
[pdf]
[DOI]

Topology-Preserving Downsampling of Binary Images
Chia-Chia Chen*, Chi-Han Peng*
[pdf]
[DOI]

ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa*, Shuming Liu, Bernard Ghanem
[pdf]
[DOI]

Classification Matters: Improving Video Action Detection with Class-Specific Attention
Jinsung Lee, Taeoh Kim, Inwoong Lee, Minho Shim, Dongyoon Wee, Minsu Cho, Suha Kwak*
[pdf]
[DOI]

Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar*, Pekka Marttinen
[pdf]
[DOI]

Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
Jinhyeok Jang*, ByungOk Han, Jaehong Kim, Chan-Hyun Youn
[pdf]
[DOI]

Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization
Jiayun Wang*, Yubei Chen, Stella X. Yu
[pdf]
[DOI]

SILC: Improving Vision Language Pretraining with Self-Distillation
Muhammad Ferjad Naeem*, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction
Guowei Xu, Jiale Tao, Wen Li*, Lixin Duan
[pdf]
[DOI]

Leveraging temporal contextualization for video action recognition
Minji Kim, Dongyoon Han, Taekyung Kim*, Bohyung Han*
[pdf]
[DOI]

ChEX: Interactive Localization and Region Description in Chest X-rays
Philip Müller*, Georgios Kaissis, Daniel Rueckert
[pdf]
[DOI]

AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale
Adam Pardyl*, Michał Wronka, Maciej Wołczyk, Kamil Adamczewski, Tomasz Trzcinski, Bartosz Zieliński*
[pdf]
[DOI]

CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
Yichao Cai*, Yuhang Liu, Zhen Zhang, Javen Qinfeng Shi
[pdf]
[DOI]

ZigMa: A DiT-style Zigzag Mamba Diffusion Model
Vincent Tao Hu*, Stefan A Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes S Fischer, Bjorn Ommer
[pdf]
[DOI]

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion
Guangyao Zhai*, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam
[pdf]
[DOI]

On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines
Selim Kuzucu*, Kemal Oksuz*, Jonathan Sadeghi, Puneet Dokania
[pdf]
[DOI]

HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization
Sakib Reza, Yuexi Zhang, Mohsen Moghaddam, Octavia Camps*
[pdf]
[DOI]

Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time
Chiao-An Yang*, Ziwei Liu, Raymond Yeh
[pdf]
[DOI]

Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries
Wei-Jer Chang*, Francesco Pittaluga, Masayoshi Tomizuka, Wei Zhan, Manmohan Chandraker
[pdf]
[DOI]

Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction
Dian Jia, Xiaoqian Ruan, Kun Xia, Zhiming Zou, Le Wang, Wei Tang*
[pdf]
[DOI]

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning
Chongyu Fan, Jiancheng Liu*, Alfred Hero, Sijia Liu
[pdf]
[DOI]

WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians
Dmytro Kotovenko*, Olga Grebenkova*, Nikolaos Sarafianos, Avinash Paliwal, Pingchuan Ma, Omid Poursaeed, Sreyas Mohan, Yuchen Fan, Yilei Li, Rakesh Ranjan, Bjorn Ommer
[pdf]
[DOI]

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
Feng Wang*, Jieru Mei, Alan Yuille
[pdf]
[DOI]

Flying with Photons: Rendering Novel Views of Propagating Light
Anagh Malik*, Noah Juravsky, Ryan Po, Gordon Wetzstein, Kiriakos N. Kutulakos, David B. Lindell
[pdf]
[DOI]

RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos
Tanveer Hannan*, Md Mohaiminul Islam, Thomas Seidl, Gedas Bertasius
[pdf]
[DOI]

MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Yuedong Chen*, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
[pdf]
[DOI]

3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views
Evangelos Ververas*, Polydefkis Gkagkos, Jiankang Deng, Michail C Doukas, Jia Guo, Stefanos Zafeiriou
[pdf]
[DOI]

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
Mu Cai, Haotian Liu, Yuheng Li*, Yijun Li, Eli Shechtman, Zhe Lin, Yong Jae Lee, Krishna Kumar Singh
[pdf]
[DOI]

Resilience of Entropy Model in Distributed Neural Networks
Milin Zhang*, Mohammad Abdi, Shahriar Rifat, Francesco Restuccia
[pdf]
[DOI]

Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis
Chirag Vashist*, Shichong Peng, Ke Li
[pdf]
[DOI]

Implicit Concept Removal of Diffusion Models
Zhili Liu*, Kai Chen, Yifan Zhang, Jianhua Han, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok
[pdf]
[DOI]

PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery
Jicheol Park, Dongwon Kim, Boseung Jeong, Suha Kwak*
[pdf]
[DOI]

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
Kai Zhang*, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu
[pdf]
[DOI]

Robust-Wide: Robust Watermarking against Instruction-driven Image Editing
Runyi Hu, Jie Zhang*, Ting Xu, Jiwei Li, Tianwei Zhang
[pdf]
[DOI]

OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal
Qiao Mo, Yukang Ding, Jinhua Hao*, Qiang Zhu, Ming Sun, Chao Zhou, Feiyu Chen, Shuyuan Zhu*
[pdf]
[DOI]

Formula-Supervised Visual-Geometric Pre-training
Ryosuke Yamada*, Kensho Hara*, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh
[pdf]
[DOI]

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan, Xiaojian Ma*, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li*
[pdf]
[DOI]

Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-Spoofing
Guanghao Zheng, Yuchen Liu, Wenrui Dai*, Chenglin Li, Junni Zou, Hongkai Xiong
[pdf]
[DOI]

Restoring Images in Adverse Weather Conditions via Histogram Transformer
Shangquan Sun, Wenqi Ren*, Xinwei Gao, Rui Wang, Xiaochun Cao
[pdf]
[DOI]

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen*, Xiaokang Yang
[pdf]
[DOI]

NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis
Yubin Hu, Xiaoyang Guo, Yang Xiao, Jingwei Huang, Yong-Jin Liu*
[pdf]
[DOI]

Elysium: Exploring Object-level Perception in Videos through Semantic Integration Using MLLMs
Han Wang*, Yanjie Wang, Ye Yongjie, Yuxiang Nie, Can Huang
[pdf]
[DOI]

G2fR: Frequency Regularization in Grid-based Feature Encoding Neural Radiance Fields
Shuxiang Xie*, Shuyi Zhou, Ken Sakurada, Ryoichi Ishikawa, Masaki Onishi, Takeshi Oishi
[pdf]
[DOI]

Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Agneet Chatterjee*, Gabriela Ben Melech Stan, Estelle Guez Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hanna Hajishirzi, Vasudev Lal, Chitta R Baral, Yezhou Yang
[pdf]
[DOI]

Generating 3D House Wireframes with Semantics
Xueqi Ma, Yilin Liu, Wenjun Zhou, Ruowei Wang, Hui Huang*
[pdf]
[DOI]

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Xiao Fu*, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long
[pdf]
[DOI]

Shape-guided Configuration-aware Learning for Endoscopic-image-based Pose Estimation of Flexible Robotic Instruments
Yiyao Ma*, Kai Chen*, Hon-Sing Tong, Ruofeng Wei, Yui-Lun Ng, Ka-Wai Kwok*, Qi Dou*
[pdf]
[DOI]

Nonverbal Interaction Detection
Jianan Wei, Tianfei Zhou, Yi Yang, Wenguan Wang*
[pdf]
[DOI]

UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang*, Zhenhua Guo, Tao Luo*, Chun-Mei Feng, Wangmeng Zuo
[pdf]
[DOI]

Responsible Visual Editing
Minheng Ni, Yeli Shen, Lei Zhang*, Wangmeng Zuo*
[pdf]
[DOI]

Drag Anything: Motion Control for Anything using Entity Representation
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou*, Yan Li, Tingting Gao, Zhang Di
[pdf]
[DOI]

SegPoint: Segment Any Point Cloud via Large Language Model
Shuting He, Henghui Ding, Xudong Jiang, Bihan Wen*
[pdf]
[DOI]

Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan, Rui Liu, Wenguan Wang*, Yi Yang
[pdf]
[DOI]

Rebalancing Using Estimated Class Distribution for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch
Taemin Park, Hyuck Lee, Heeyoung Kim*
[pdf]
[DOI]

Vista3D: Unravel the 3D Darkside of a Single Image
Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang*
[pdf]
[DOI]

The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao, Chan-Feng Hsu*, Jhe-Hao Lin, Hongxia Xie, Terence Lin, Yi-Ning Huang, Hong-Han Shuai*, Wen-Huang Cheng*
[pdf]
[DOI]

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection
Junjie Huang*, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du
[pdf]
[DOI]

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
Qiuhong Shen, Xingyi Yang, Xinchao Wang*
[pdf]
[DOI]

Exploiting Dual-Correlation for Multi-frame Time-of-Flight Denoising
Guanting Dong*, Yueyi Zhang*, Xiaoyan Sun, Zhiwei Xiong
[pdf]
[DOI]

Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection
Kwanyong Park, Kuniaki Saito, Donghyun Kim*
[pdf]
[DOI]

Domesticating SAM for Breast Ultrasound Image Segmentation via Spatial-frequency Fusion and Uncertainty Correction
Wanting Zhang, Huisi Wu*, Jing Qin
[pdf]
[DOI]

CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images
Jisu Shin, Junmyeong Lee, Seongmin Lee, Min-Gyu Park, Jumi Kang, Ju Hong Yoon, Hae-Gon Jeon*
[pdf]
[DOI]

Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation
Genki Kinoshita*, Ko Nishino
[pdf]
[DOI]

Uni3DL: A Unified Model for 3D Vision-Language Understanding
Xiang Li*, Jian Ding, Zhaoyang Chen, Mohamed Elhoseiny
[pdf]
[DOI]

Object-Aware NIR-to-Visible Translation
Yunyi Gao, Lin Gu, Qiankun Liu, Ying Fu*
[pdf]
[DOI]

PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud*, Burhaneddin Yaman, Chun-Hao Liu, Diana Marculescu
[pdf]
[DOI]

GENIXER: Empowering Multimodal Large Language Models as a Powerful Data Generator
Henry Hengyuan Zhao*, Pan Zhou*, Mike Zheng Shou*
[pdf]
[DOI]

BLINK: Multimodal Large Language Models Can See but Not Perceive
Xingyu Fu*, Yushi Hu*, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A Smith, Wei-Chiu Ma, Ranjay Krishna
[pdf]
[DOI]

AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation
Lorenzo Mur-Labadia*, Ruben Martinez-Cantin, Jose J Guerrero, Giovanni Maria Farinella, Antonino Furnari
[pdf]
[DOI]

PreLAR: World Model Pre-training with Learnable Action Representation
Lixuan Zhang, Meina Kan, Shiguang Shan, Xilin Chen*
[pdf]
[DOI]

Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot
Fabien Baradel*, Thomas Lucas, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Gregory Rogez
[pdf]
[DOI]

De-confounded Gaze Estimation
Ziyang Liang, Yiwei Bao, Feng Lu*
[pdf]
[DOI]

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions
Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi*
[pdf]
[DOI]

FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li*, Yanhao Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan*
[pdf]
[DOI]

ReGround: Improving Textual and Spatial Grounding at No Cost
Phillip Y. Lee, Minhyuk Sung*
[pdf]
[DOI]

CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
Jiewen Yang*, Yiqun Lin, Bin Pu, Jiarong Guo, Xiaowei Xu*, Xiaomeng Li*
[pdf]
[DOI]

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du, Yu Wang, Yifan Sun, Luting Wang, Yue Liao, Gang Zhang, Errui Ding, Yan Wang*, Jingdong Wang, Si Liu*
[pdf]
[DOI]

Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Lingyu Zhu, Wenhan Yang, Baoliang Chen, Hanwei Zhu, Zhangkai Ni, Qi Mao, Shiqi Wang*
[pdf]
[DOI]

Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël, Renaud Vandeghen*, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck
[pdf]
[DOI]

VP-SAM: Taming Segment Anything Model for Video Polyp Segmentation via Disentanglement and Spatio-temporal Side Network
Zhixue Fang, Yuzhi Liu, Huisi Wu*, Jing Qin
[pdf]
[DOI]

Dataset Enhancement with Instance-Level Augmentations
Orest Kupyn*, Christian Rupprecht
[pdf]
[DOI]

FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models
Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi*
[pdf]
[DOI]

Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild
Donggyun Kim, Seongwoong Cho, Semin Kim, Chong Luo, Seunghoon Hong*
[pdf]
[DOI]

Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan-Hung Vu*, Mickael Chen, Patrick Pérez, Matthieu Cord
[pdf]
[DOI]

SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
Runmin Zhang*, Jun Ma, Lun Luo, Beinan Yu, Shu-Jie Chen, Junwei Li, Hui-Liang Shen, Si-Yuan Cao*
[pdf]
[DOI]

SCAPE: A Simple and Strong Category-Agnostic Pose Estimator
Yujia Liang, Zixuan Ye, Wenze Liu, Hao Lu*
[pdf]
[DOI]

Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
Mainak Singha*, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee
[pdf]
[DOI]

Improving Knowledge Distillation via Regularizing Feature Direction and Norm
Yuzhu Wang, Lechao Cheng*, Manni Duan, Yongheng Wang, Zunlei Feng, Shu Kong
[pdf]
[DOI]

3DFG-PIFu: 3D Feature Grids for Human Digitization from Sparse Views
Kennard Yanting Chan*, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin
[pdf]
[DOI]

Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan*, Zongze Wu, Richard Zhang, Eli Shechtman, Danny Cohen-Or, Taesung Park, Michaël Gharbi
[pdf]
[DOI]

Non-parametric Sensor Noise Modeling and Synthesis
Ali Mosleh*, Luxi Zhao, Atin Vikram Singh, Jaeduk Han, Abhijith Punnappurath, Marcus A Brubaker, Jihwan Choe, Michael S Brown
[pdf]
[DOI]

Stripe Observation Guided Inference Cost-free Attention Mechanism
Zhongzhan Huang*, Shanshan Zhong, Wushao Wen, Jinghui Qin, Liang Lin*
[pdf]
[DOI]

The Nerfect Match: Exploring NeRF Features for Visual Localization
Qunjie Zhou*, Maxim Maximov, Or Litany, Laura Leal-Taixé
[pdf]
[DOI]

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance
Yongwei Chen, Tengfei Wang, Tong Wu, Xingang Pan, Kui Jia*, Ziwei Liu
[pdf]
[DOI]

Robust Calibration of Large Vision-Language Adapters
Balamurali Murugesan*, Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz
[pdf]
[DOI]

Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation
Haizhong Zheng*, Jiachen Sun, Shutong Wu, Bhavya Kailkhura, Zhuoqing Morley Mao, Chaowei Xiao*, Atul Prakash*
[pdf]
[DOI]

Improving Domain Generalization in Self-Supervised Monocular Depth Estimation via Stabilized Adversarial Training
Yuanqi Yao*, Gang Wu, Kui Jiang, Siao Liu, Jian Kuai, Xianming Liu, Junjun Jiang*
[pdf]
[DOI]

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing
Fangqiang Ding*, Zhen Luo, Peijun Zhao, Chris Xiaoxuan Lu
[pdf]
[DOI]

denoiSplit: a method for joint microscopy image splitting and unsupervised denoising
Ashesh Ashesh*, Florian Jug*
[pdf]
[DOI]

AugDETR: Improving Multi-scale Learning for Detection Transformer
Jinpeng Dong, Yutong Lin, Chen Li, Sanping Zhou, Nanning Zheng*
[pdf]
[DOI]

Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
Heeseung Yun*, Ruohan Gao, Ishwarya Ananthabhotla, Anurag Kumar, Jacob Donley, Chao Li, Gunhee Kim, Vamsi Krishna Ithapu, Calvin Murdock*
[pdf]
[DOI]

SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images
Josh David Myers-Dean*, Jarek T Reynolds, Brian Price, Yifei Fan, Danna Gurari
[pdf]
[DOI]

SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi*, Michael Dorkenwald*, Fida Mohammad Thoker, Efstratios Gavves, Cees Snoek, Yuki M Asano
[pdf]
[DOI]

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis
Basile Van Hoorick*, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick
[pdf]
[DOI]

Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams
Ziqiang Wang*, Zhixiang Chi, Yanan Wu, Li Gu, Zhi Liu*, Konstantinos N Plataniotis*, Yang Wang*
[pdf]
[DOI]

Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images
Tianyu Luan, Zhongpai Gao, Luyuan Xie, Abhishek Sharma, Hao Ding, Benjamin Planche, Meng Zheng, Ange Lou, Terrence Chen, Junsong Yuan, Ziyan Wu*
[pdf]
[DOI]

Understanding Physical Dynamics with Counterfactual World Modeling
Rahul Venkatesh*, Honglin Chen*, Kevin Feigelis, Daniel M Bear, Khaled Jedoui, Klemen Kotar, Felix J Binder, Wanhee Lee, Sherry Liu, Kevin Smith, Judith E. Fan, Daniel Yamins
[pdf]
[DOI]

MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition
Aggelina Chatziagapi*, Grigorios Chrysos, Dimitris Samaras
[pdf]
[DOI]

4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation
Feng Cheng*, Mi Luo*, Huiyu Wang, Alex Dimakis, Lorenzo Torresani, Gedas Bertasius, Kristen Grauman
[pdf]
[DOI]

Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance
I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu, Ming-Hsuan Yang, Sy-Yen Kuo*
[pdf]
[DOI]

Nymeria: A Massive Collection of Egocentric Multi-modal Human Motion in the Wild
Lingni Ma*, Yuting Ye, Rowan Postyeni, Alexander J Gamino, Vijay Baiyya, Luis Pesqueira, Kevin M Bailey, David Soriano Fosas, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Hyo Jin Kim, Jakob Engel, Karen Liu, Ziwei Liu, Renzo De Nardi, Richard Newcombe
[pdf]
[DOI]

DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng*, Faria Huq, Yue Jiang, Jason Wu, Xin Yue Li, Jeffrey Bigham, Amy Pavel
[pdf]
[DOI]

SemTrack: A Large-scale Dataset for Semantic Tracking in the Wild
Pengfei Wang, Xiaofei Hui, Jing Wu, Zile Yang, Kian Eng Ong, Xinge Zhao, Beijia Lu, Dezhao Huang, Evan Ling, Weiling Chen, Keng Teck Ma, Minhoe Hur, Jun Liu*
[pdf]
[DOI]

VideoMamba: Spatio-Temporal Selective State Space Model
Jinyoung Park*, Hee-Seon Kim, Kangwook Ko, Minbeom Kim, Changick Kim
[pdf]
[DOI]

Text to Layer-wise 3D Clothed Human Generation
Junting Dong*, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, Bo Dai
[pdf]
[DOI]

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing
Tianxing Xu*, Wenbo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang
[pdf]
[DOI]

Fully Sparse 3D Occupancy Prediction
Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang*
[pdf]
[DOI]

Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song*, Tae Soo Kim, Junha Kim, Gunhee Nam, Thijs Kooi, Jaegul Choo*
[pdf]
[DOI]

CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field
Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui*
[pdf]
[DOI]

Shifted Autoencoders for Point Annotation Restoration in Object Counting
Yuda Zou, Xin Xiao, Peilin Zhou, Zhichao Sun, Bo Du, Yongchao Xu*
[pdf]
[DOI]

PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu*, Xiaolong Wang, Tai Wang*, Yilun Chen, Jiangmiao Pang*, Dahua Lin
[pdf]
[DOI]

GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
Shiyue Zhang, Zheng Chong, Xujie Zhang, Hanhui Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang*
[pdf]
[DOI]

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
Zhenghao Peng, Wenjie Luo, Yiren Lu*, Tianyi Shen, Cole Gulino, Ari Seff, Justin Fu
[pdf]
[DOI]

Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
Chaofeng Chen*, Annan Wang, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin*
[pdf]
[DOI]

Asymmetric Mask Scheme for Self-Supervised Real Image Denoising
Xiangyu Liao*, Tianheng Zheng, Jiayu Zhong, Pingping Zhang, Chao Ren*
[pdf]
[DOI]

Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
Mengchen Zhang*, Tong Wu, Tai Wang, Tengfei Wang, Ziwei Liu, Dahua Lin*
[pdf]
[DOI]

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting
Lingzhe Zhao, Peng Wang, Peidong Liu*
[pdf]
[DOI]

Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
Qi Sun*, Hang Zhou, Wengang Zhou, Li Li, Houqiang Li
[pdf]
[DOI]

BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression
Yufeng Zhang, Hang Yu, Shizhan Liu, Wenrui Dai, Weiyao Lin*
[pdf]
[DOI]

FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li*, Delin Chen, Tianle Cai, Peihao Chen, Yining Hong, Zhenfang Chen, Yikang Shen, Chuang Gan
[pdf]
[DOI]

Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable Repainting
Junwu Zhang*, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin, Yida Wei, Xing Zhou, Munan Ning, Li Yuan*
[pdf]
[DOI]

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation
Xinzhou Wang, Yikai Wang*, Junliang Ye, Fuchun Sun*, Zhengyi Wang, Ling Wang, Pengkun Liu, Kai Sun, Xintong Wang, Xie Wende, Fangfu Liu, Bin He
[pdf]
[DOI]

Spatially-Variant Degradation Model for Dataset-free Super-resolution
Shaojie Guo, Haofei Song, Qingli Li, Yan Wang*
[pdf]
[DOI]

DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu*, Wei-Shi Zheng*
[pdf]
[DOI]

Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence
Hongyuan Wang, Lizhi Wang*, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan
[pdf]
[DOI]

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
Peng Jin*, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu*, Xiangyang Ji, Li Yuan*, Jie Chen
[pdf]
[DOI]

EAFormer: Scene Text Segmentation with Edge-Aware Transformers
Haiyang Yu, Teng Fu, Bin Li*, Xiangyang Xue
[pdf]
[DOI]

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects
Zicong Fan, Takehiko Ohkawa*, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao
[pdf]
[DOI]

DetailSemNet: Elevating Signature Verification through Detail-Semantic Integration
Meng-Cheng Shih*, Tsai-Ling Huang, Yu-Heng Shih, Hong-Han Shuai, Hsuan-Tung Liu, Yi-Ren Yeh, Ching-Chun Huang*
[pdf]
[DOI]

LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Ruida Zhang, Ziqin Huang, Gu Wang, Chenyangguang Zhang, Yan Di, Xingxing Zuo, Jiwen Tang, Xiangyang Ji*
[pdf]
[DOI]

Upper-body Hierarchical Graph for Skeleton Based Emotion Recognition in Assistive Driving
Jiehui Wu, Jiansheng Chen*, Qifeng Luo, Siqi Liu, Youze Xue, Huimin Ma
[pdf]
[DOI]

Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction
Yansheng Li, Tingzhu Wang*, Kang Wu, Linlin Wang, Xin Guo, Wenbin Wang
[pdf]
[DOI]

Exploring Guided Sampling of Conditional GANs
Yifei Zhang*, Mengfei Xia, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng, Lianghua Huang, Yu Liu, Fan Cheng*
[pdf]
[DOI]

MotionChain: Conversational Motion Controllers via Multimodal Prompts
Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan*
[pdf]
[DOI]

Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
Lilang Lin, Lehong Wu, Jiahang Zhang, Jiaying Liu*
[pdf]
[DOI]

Latent Guard: a Safety Framework for Text-to-image Generation
Runtao Liu*, Ashkan Khakzar, Jindong Gu, Qifeng Chen*, Philip Torr, Fabio Pizzati*
[pdf]
[DOI]

MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
Lehong Wu*, Lilang Lin, Jiahang Zhang, Yiyang Ma, Jiaying Liu*
[pdf]
[DOI]

TCC-Det: Temporarily consistent cues for weakly-supervised 3D detection
Jan Skvrna*, Lukáš Neumann
[pdf]
[DOI]

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection
Jinghua Hou, Tong Wang, Xiaoqing Ye, Zhe Liu, Shi Gong, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai*
[pdf]
[DOI]

FoundPose: Unseen Object Pose Estimation with Foundation Features
Evin Pınar Örnek*, Yann Labbé, Bugra Tekin, Lingni Ma, Cem Keskin, Christian Forster, Tomas Hodan
[pdf]
[DOI]

Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation
Zhengyuan Xie, Haiquan Lu, Jia-wen Xiao, Enguang Wang, Le Zhang, Xialei Liu*
[pdf]
[DOI]

Kalman-Inspired Feature Propagation for Video Face Super-Resolution
Ruicheng Feng, Chongyi Li, Chen Change Loy*
[pdf]
[DOI]

Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu*, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
[pdf]
[DOI]

VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li*, Xinhao Li, Yi Wang*, Yinan He, Yali Wang*, Limin Wang*, Yu Qiao*
[pdf]
[DOI]

SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
Lingtong Kong*, Bo Li, Yike Xiong, Hao Zhang, Hong Gu, Jinwei Chen
[pdf]
[DOI]

Heterogeneous Graph Learning for Scene Graph Prediction in 3D Point Clouds
Yanni Ma, Hao Liu, Yun Pei, Yulan Guo*
[pdf]
[DOI]

Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
Ming Nie, Renyuan Peng, Chunwei Wang, Xinyue Cai, Jianhua Han, Hang Xu*, Li Zhang*
[pdf]
[DOI]

Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
Shouwei Ruan*, Yinpeng Dong, Liu Hanqing, Yao Huang, Hang Su, Xingxing Wei*
[pdf]
[DOI]

Deep Cost Ray Fusion for Sparse Depth Video Completion
Jungeon Kim, Soongjin Kim, Jaesik Park, Seungyong Lee*
[pdf]
[DOI]

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection
Ziying Song, Lei Yang, Shaoqing Xu, Lin Liu, Dongyang Xu, Caiyan Jia*, Feiyang Jia, Li Wang
[pdf]
[DOI]

DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video
Narek Tumanyan*, Assaf Singer, Shai Bagon, Tali Dekel
[pdf]
[DOI]

GraspXL: Generating Grasping Motions for Diverse Objects at Scale
Hui Zhang*, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song
[pdf]
[DOI]

Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models
Ruibin Li*, Ruihuang Li, Song Guo, Lei Zhang
[pdf]
[DOI]

Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
Nishad Singhi*, Jae Myung Kim, Karsten Roth, Zeynep Akata
[pdf]
[DOI]

JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation
ChenHan Jiang*, Yihan Zeng, Tianyang Hu, Songcen Xu, Wei Zhang, Hang Xu, Dit-Yan Yeung
[pdf]
[DOI]

Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals
Camilo L Fosco*, Benjamin Lahner, Bowen Pan, Alex Andonian, Emilie L Josephs, Alex Lascelles, Aude Oliva
[pdf]
[DOI]

Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection
Deepti Hegde, Suhas Lohit*, Kuan-Chuan Peng*, Michael J. Jones, Vishal M. Patel
[pdf]
[DOI]

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking
Siyuan Li*, Lei Ke, Yung-Hsu Yang, Luigi Piccinelli, Mattia Segù, Martin Danelljan, Luc Van Gool
[pdf]
[DOI]

Tensorial template matching for fast cross-correlation with rotations and its application for tomography
Antonio Martinez-Sanchez*, Ulrike Homberg, J. M. Almira, Harold Phelippeau
[pdf]
[DOI]

FreeAugment: Data Augmentation Search Across All Degrees of Freedom
Tom Bekor*, Niv Nayman, Lihi Zelnik-Manor
[pdf]
[DOI]

Learning Representations of Satellite Images From Metadata Supervision
Jules Bourcier*, Gohar Dashyan, Karteek Alahari, Jocelyn Chanussot
[pdf]
[DOI]

I2-SLAM: Inverting Imaging Process for Robust Photorealistic Dense SLAM
Gwangtak Bae, Changwoon Choi, Hyeongjun Heo, Sang Min Kim, Young Min Kim*
[pdf]
[DOI]

FlashTex: Fast Relightable Mesh Texturing with LightControlNet
Kangle Deng*, Timothy Omernick, Alexander B Weiss, Deva Ramanan, Jun-Yan Zhu, Tinghui Zhou, Maneesh Agrawala
[pdf]
[DOI]

GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence
Pengyuan Wang*, Takuya Ikeda, Robert Lee, Koichi Nishiwaki
[pdf]
[DOI]

ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Yicheng Zhu*, Keren Ye*, Junjie Ke, Jiahui Yu, Leonidas Guibas, Peyman Milanfar, Feng Yang*
[pdf]
[DOI]

PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance
Aoming Liu*, Zhong Li*, Zhang Chen*, Nannan Li, Yi Xu, Bryan Plummer
[pdf]
[DOI]

SOS: Segment Object System for Open-World Instance Segmentation With Object Priors
Christian Wilms*, Tim Rolff, Maris N Hillemann, Robert Johanson, Simone Frintrop
[pdf]
[DOI]

Lagrangian Hashing for Compressed Neural Field Representations
Shrisudhan Govindarajan*, Zeno Sambugaro, Akhmedkhan Shabanov, Towaki Takikawa, Weiwei Sun, Daniel Rebain, Nicola Conci, Kwang Moo Yi, Andrea Tagliasacchi
[pdf]
[DOI]

EDformer: Transformer-Based Event Denoising Across Varied Noise Levels
Bin Jiang, Bo Xiong, Bohan Qu, M. Salman Asif, You Zhou*, Zhan Ma*
[pdf]
[DOI]

Foster Adaptivity and Balance in Learning with Noisy Labels
Mengmeng Sheng, Zeren Sun*, Tao Chen, Shuchao Pang, Yucheng Wang, Yazhou Yao*
[pdf]
[DOI]

MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Van Pham*, Hoang Anh Dung, Cuong Cao Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do
[pdf]
[DOI]

Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Qian Chen, Shihao Shu, Xiangzhi Bai*
[pdf]
[DOI]

Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach
Shizhou Zhang, Wenlong Luo, De Cheng*, Qingchun Yang, Lingyan Ran, Yinghui Xing, Yanning Zhang
[pdf]
[DOI]

Unleashing the Power of Prompt-driven Nucleus Instance Segmentation
Zhongyi Shui*, Yunlong Zhang, Kai Yao, Chenglu Zhu, Sunyi Zheng, Jingxiong Li, Honglin Li, Yuxuan Sun, Ruizhe Guo, Lin Yang*
[pdf]
[DOI]

Gaze Target Detection Based on Head-Local-Global Coordination
Yaokun Yang, Feng Lu*
[pdf]
[DOI]

3DSA: Multi-View 3D Human Pose Estimation With 3D Space Attention Mechanisms
Po Han Chen, Chia-Chi Tsai*
[pdf]
[DOI]

Toward Tiny and High-quality Facial Makeup with Data Amplify Learning
Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin, Ying Chen, Rui Shi, Yucheng Zheng, Yupeng Zhu, Bingbing Ni*
[pdf]
[DOI]

An Economic Framework for 6-DoF Grasp Detection
Xiao-Ming Wu*, Jia-Feng Cai, Jian-Jian Jiang, Dian Zheng, Yi-Lin Wei, Wei-Shi Zheng*
[pdf]
[DOI]

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang, Jie Zhou, Jiwen Lu*
[pdf]
[DOI]

Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li*
[pdf]
[DOI]

AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu, Jiaxin Chen*, Hanwen Zhong, Di Huang, Yunhong Wang
[pdf]
[DOI]

Multi-Label Cluster Discrimination for Visual Representation Learning
Xiang An, Kaicheng Yang, Xiangzi Dai, Ziyong Feng, Jiankang Deng*
[pdf]
[DOI]

Plan, Posture and Go: Towards Open-vocabulary Text-to-Motion Generation
Jinpeng Liu, Wenxun Dai, Chunyu Wang, Yiji Cheng, Yansong Tang*, Xin Tong
[pdf]
[DOI]

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion
Junjie Guo*, Chenqiang Gao*, Fangcen Liu, Deyu Meng, Xinbo Gao
[pdf]
[DOI]

CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
Hao Fang, Jiawei Kong, Bin Chen*, Tao Dai, Hao Wu, Shu-Tao Xia
[pdf]
[DOI]

Flash Cache: Reducing Bias in Radiance Cache Based Inverse Rendering
Benjamin Attal*, Dor Verbin, Ben Mildenhall, Peter Hedman, Jonathan T Barron, Matthew O'Toole, Pratul Srinivasan
[pdf]
[DOI]

Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds
Zicheng Wang, Zhen Zhao, Yiming Wu, Luping Zhou*, Dong Xu*
[pdf]
[DOI]

A New Dataset and Framework for Real-World Blurred Images Super-Resolution
Rui Qin, Ming Sun, Chao Zhou, Bin Wang*
[pdf]
[DOI]

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
Shixiong Xu, Chenghao Zhang, Lubin Fan*, Gaofeng Meng*, Shiming Xiang, Jieping Ye
[pdf]
[DOI]

RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
Zhiyuan Zhang*, Licheng Yang, Zhiyu Xiang
[pdf]
[DOI]

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
Wen Li*, Muyuan Fang, Cheng Zou, Biao Gong, Ruobing Zheng, Meng Wang, Jingdong Chen, Ming Yang
[pdf]
[DOI]

Bidirectional Uncertainty-Based Active Learning for Open-Set Annotation
Chen-Chen Zong, Ye-Wen Wang, Kun-Peng Ning, Hai-Bo Ye, Sheng-Jun Huang*
[pdf]
[DOI]

Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective
Zhaoxin Wang*, Handing Wang*, Cong Tian, Yaochu Jin
[pdf]
[DOI]

Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation
Zeyang Zhao, Qilong Xue, Yifan Bai, Yuhang He, Xing Wei*, Yihong Gong
[pdf]
[DOI]

SeiT++: Masked Token Modeling Improves Storage-efficient Training
Minhyun Lee, Song Park, Byeongho Heo, Dongyoon Han, Hyunjung Shim*
[pdf]
[DOI]

Rectify the Regression Bias in Long-Tailed Object Detection
Ke Zhu, Minghao Fu, Jie Shao, Tianyu Liu, Jianxin Wu*
[pdf]
[DOI]

MagicEraser: Erasing Any Objects via Semantics-Aware Control
Fan Li*, Zixiao Zhang, Yi Huang, Jianzhuang Liu, Renjing Pei, Bin Shao, Songcen Xu
[pdf]
[DOI]

Reliable Spatial-Temporal Voxels For Multi-Modal Test-Time Adaptation
Haozhi Cao, Yuecong Xu, Jianfei Yang*, Pengyu Yin, Xingyu Ji, Shenghai Yuan, Lihua Xie
[pdf]
[DOI]

Stable Preference: Redefining training paradigm of human preference model for Text-to-Image Synthesis
Hanting Li, Hongjing Niu, Feng Zhao*
[pdf]
[DOI]

SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
Jintu Zheng, Yi Ding, Qizhe Liu, Yuehui Chen, Yi Cao, Ying Hu, Zenan Wang*
[pdf]
[DOI]

NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model
Zhongqun Zhang*, Hengfei Wang, Ziwei Yu, Yihua Cheng*, Angela Yao, Hyung Jin Chang
[pdf]
[DOI]

Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai, ZheKai Duan, Gaowen Liu, Charles Fleming, Chris Xiaoxuan Lu*
[pdf]
[DOI]

Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers
Zhengbo Zhang*, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu*
[pdf]
[DOI]

Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification
Hai Ci*, Pei Yang, Yiren Song, Mike Zheng Shou*
[pdf]
[DOI]

3D Small Object Detection with Dynamic Spatial Pruning
Zhihao Sun, Ziwei Wang, Hongmin Liu, Jie Zhou, Jiwen Lu*, Xiuwei Xu*
[pdf]
[DOI]

STSP: Spatial-Temporal Subspace Projection for Video Class-incremental Learning
Hao Cheng, Siyuan Yang, Chong Wang, Joey Tianyi Zhou, Alex Kot, Bihan Wen*
[pdf]
[DOI]

Transferable 3D Adversarial Shape Completion using Diffusion Models
Xuelong Dai*, Bin Xiao
[pdf]
[DOI]

OmniSat: Self-Supervised Modality Fusion for Earth Observation
Guillaume Astruc*, Nicolas Gonthier, Clement Mallet, Loic Landrieu
[pdf]
[DOI]

Distilling Diffusion Models into Conditional GANs
MinGuk Kang*, Richard Zhang, Connelly Barnes, Sylvain Paris, Suha Kwak, Jaesik Park, Eli Shechtman, Jun-Yan Zhu, Taesung Park*
[pdf]
[DOI]

Semantically Guided Representation Learning For Action Anticipation
Anxhelo Diko*, Danilo Avola, Bardh Prenkaj, Federico Fontana, Luigi Cinque
[pdf]
[DOI]

MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory
Juwon Kang*, Nayeong Kim, Jungseul Ok, Suha Kwak*
[pdf]
[DOI]

FREST: Feature RESToration for Semantic Segmentation under Multiple Adverse Conditions
Sohyun Lee, Namyup Kim, Sungyeon Kim, Suha Kwak*
[pdf]
[DOI]

ScanTalk: 3D Talking Heads from Unregistered Scans
Federico Nocentini*, Thomas Besnier, Claudio Ferrari, Sylvain Arguillere, Stefano Berretti, Mohamed Daoudi
[pdf]
[DOI]

Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong, Jinyu Chen, Wenguan Wang*, Hang Su, Xiaolin Hu, Yi Yang, Si Liu*
[pdf]
[DOI]

GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang*, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang
[pdf]
[DOI]

ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
Chenhang He*, Ruihuang Li, Guowen Zhang, Lei Zhang
[pdf]
[DOI]

A Cephalometric Landmark Regression Method based on Dual-encoder for High-resolution X-ray Image
Chao Dai, Yang Wang*, Chaolin Huang, Zhou Jiakai, Qilin Xu, Minpeng Xu
[pdf]
[DOI]

Exploring the Feature Extraction and Relation Modeling For Light-Weight Transformer Tracking
Jikai Zheng, Mingjiang Liang, Shaoli Huang, Jifeng Ning*
[pdf]
[DOI]

LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment
Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun*, Yuexin Ma*
[pdf]
[DOI]

You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Mehdi Noroozi*, Isma Hadji*, Brais Martinez*, Adrian Bulat*, Georgios Tzimiropoulos*
[pdf]
[DOI]

Gaussian Grouping: Segment and Edit Anything in 3D Scenes
Mingqiao Ye, Martin Danelljan, Fisher Yu, Lei Ke*
[pdf]
[DOI]

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
Yiming Huang*, Weilin Wan, Yue Yang, Chris Callison-Burch, Mark Yatskar, Lingjie Liu
[pdf]
[DOI]

MegaScenes: Scene-Level View Synthesis at Scale
Joseph Tung, Gene Chou*, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, Noah Snavely
[pdf]
[DOI]

SuperGaussian: Repurposing Video Models for 3D Super Resolution
Yuan Shen*, Duygu Ceylan*, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Fruehstueck*
[pdf]
[DOI]

Towards Model-Agnostic Dataset Condensation by Heterogeneous Models
Jun-Yeong Moon, Jung Uk Kim*, Gyeong-Moon Park*
[pdf]
[DOI]

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos
Kirolos Ataallah*, Xiaoqian Shen, Eslam Mohamed Abdelrahman*, Essam Sleiman, Mingchen Zhuge, Jian Ding, Deyao Zhu, Jürgen Schmidhuber, Mohamed Elhoseiny
[pdf]
[DOI]

MeshFeat: Multi-Resolution Features for Neural Fields on Meshes
Mihir Mahajan*, Florian Hofherr*, Daniel Cremers
[pdf]
[DOI]

Decoupling Common and Unique Representations for Multimodal Self-supervised Learning
Yi Wang*, Conrad M Albrecht, Nassim Ait Ali Braham, Chenying Liu, Zhitong Xiong, Xiao Xiang Zhu
[pdf]
[DOI]

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Samuel Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Futang Peng, Anton Belyi, Max A Schwarzer, Hongyu Hè, Xianzhi Du, Haotian Zhang, Karanjeet Singh, Doug Kang, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev*, Yinfei Yang
[pdf]
[DOI]

Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Yixiao Wang*, Chen Tang, Lingfeng Sun, Simone Rossi, Yichen Xie, Chensheng Peng, Thomas Hannagan, Stefano Sabatini, Nicola Poerio, Masayoshi Tomizuka, Wei Zhan
[pdf]
[DOI]

2S-ODIS: Two-Stage Omni-Directional Image Synthesis by Geometric Distortion Correction
Atsuya Nakata*, Takao Yamanaka*
[pdf]
[DOI]

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
Xiaoyu Zhu*, Hao Zhou, Pengfei Xing, Long Zhao, Hao Xu, Junwei Liang, Alexander G. Hauptmann, Ting Liu, Andrew Gallagher
[pdf]
[DOI]

D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction
Bowen Fu*, Gu Wang*, Chenyangguang Zhang, Yan Di, Ziqin Huang, Zhiying Leng, Fabian Manhardt, Xiangyang Ji*, Federico Tombari*
[pdf]
[DOI]

Combining Generative and Geometry Priors for Wide-Angle Portrait Correction
Lan Yao, Chaofeng Chen, Xiaoming Li*, Zifei Yan, Wangmeng Zuo
[pdf]
[DOI]

RealViformer: Investigating Attention for Real-World Video Super-Resolution
Yuehan Zhang*, Angela Yao
[pdf]
[DOI]

Pairwise Distance Distillation for Unsupervised Real-World Image Super-Resolution
Yuehan Zhang*, Seungjun Lee, Angela Yao
[pdf]
[DOI]

Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation
Zhao Zhe*, Mengshi Qi, Huadong Ma
[pdf]
[DOI]

UniFS: Universal Few-shot Instance Perception with Point Representations
Sheng Jin*, Ruijie Yao, Lumin Xu, Wentao Liu*, Chen Qian, Ji Wu, Ping Luo*
[pdf]
[DOI]

SemanticHuman-HD: High Resolution Semantic disentangled 3D Human Generation
Peng Zheng, Tao Liu, Zili Yi, Rui Ma*
[pdf]
[DOI]

CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians
Avinash Paliwal*, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari
[pdf]
[DOI]

Monocular Occupancy Prediction for Scalable Indoor Scenes
Hongxiao Yu, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang*
[pdf]
[DOI]

Visual Grounding for Object-Level Generalization in Reinforcement Learning
Haobin Jiang, Zongqing Lu*
[pdf]
[DOI]

3DEgo: 3D Editing on the Go!
Umar Khalid*, Hasan Iqbal*, Azib Farooq, Jing Hua, Chen Chen*
[pdf]
[DOI]

Efficient Depth-Guided Urban View Synthesis
sheng miao*, Jiaxin Huang, Dongfeng Bai, Weichao Qiu, Liu Bingbing, Andreas Geiger, Yiyi Liao
[pdf]
[DOI]

Probabilistic Weather Forecasting with Deterministic Guidance-based Diffusion Model
Donggeun Yoon, Minseok Seo, Doyi Kim, Yeji Choi, Donghyeon Cho*
[pdf]
[DOI]

Domain-adaptive Video Deblurring via Test-time Blurring
Jin-Ting He*, Fu-Jen Tsai, Jia-Hao Wu, Yan-Tsung Peng, Chung-Chi Tsai, Chia-Wen Lin, Yen-Yu Lin
[pdf]
[DOI]

Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures
Jiaxing Huang, Yanfeng Zhou, Yaoru Luo, Guole Liu, Heng Guo, Ge Yang*
[pdf]
[DOI]

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving
William Ljungbergh*, Adam Tonderski, Joakim Johnander, Holger Caesar, Kalle Åström, Michael Felsberg, Christoffer Petersson
[pdf]
[DOI]

OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing
Pranav Gupta*, Rishubh Singh, Pradeep Shenoy, Ravi Kiran Sarvadevabhatla*
[pdf]
[DOI]

Progressive Pretext Task Learning for Human Trajectory Prediction
Xiaotong Lin, Tianming Liang, Jianhuang Lai, Jian-Fang Hu*
[pdf]
[DOI]

Hyperion – A fast, versatile symbolic Gaussian Belief Propagation framework for Continuous-Time SLAM
David Hug*, Ignacio Alzugaray, Margarita Chli
[pdf]
[DOI]

Isomorphic Pruning for Vision Models
Gongfan Fang*, Xinyin Ma, Michael Bi Mi, Xinchao Wang*
[pdf]
[DOI]

Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu*, Weihao Yu*, Xinchao Wang*
[pdf]
[DOI]

Learning Cross-hand Policies of High-DOF Reaching and Grasping
Qijin She, Shishun Zhang, Yunfan Ye, Ruizhen Hu, Kai Xu*
[pdf]
[DOI]

Reprojection Errors as Prompts for Efficient Scene Coordinate Regression
Ting-Ru Liu*, Hsuan-Kung Yang, Jou-Min Liu, Chun-Wei Huang, Tsung-Chih Chiang, Quan Kong, Norimasa Kobori, Chun-Yi Lee
[pdf]
[DOI]

Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning
Jinglin Liang, Jin Zhong, Hanlin Gu, Zhongqi Lu, Xingxing Tang, Gang Dai, Shuangping Huang*, Lixin Fan, Qiang Yang
[pdf]
[DOI]

Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment
Zhanzhong Pang*, Fadime Sener, Shrinivas Ramasubramanian, Angela Yao
[pdf]
[DOI]

REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee*, Yiran Luo, Tejas Gokhale, Yezhou Yang, Chitta R Baral
[pdf]
[DOI]

DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing
Hyeonho Jeong, Jinho Chang, Geon Yeong Park, Jong Chul Ye*
[pdf]
[DOI]

VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos
Devesh Walawalkar*, Pablo Garrido
[pdf]
[DOI]

Unveiling Privacy Risks in Stochastic Neural Networks Training: Effective Image Reconstruction from Gradients
Yiming Chen*, Xiangyu Yang, Nikos Deligiannis
[pdf]
[DOI]

Controlling the World by Sleight of Hand
Sruthi Sudhakar*, Ruoshi Liu, Basile Van Hoorick, Carl Vondrick, Richard Zemel
[pdf]
[DOI]

Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack
Mingyu Yang*, Daizong Liu, Keke Tang, Pan Zhou, Lixing Chen, Junyang Chen
[pdf]
[DOI]

Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
Yongwei Nie, Hao Huang, Chengjiang Long, Qing Zhang, Pradipta Maji, Hongmin Cai*
[pdf]
[DOI]

Cross-Domain Learning for Video Anomaly Detection with Limited Supervision
Yashika Jain, Ali Dabouei*, Min Xu*
[pdf]
[DOI]

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Chien-Yao Wang*, I-Hau Yeh, Hong-Yuan Mark Liao
[pdf]
[DOI]

Unsupervised Multi-modal Medical Image Registration via Invertible Translation
Mengjie Guo*
[pdf]
[DOI]

Functional Transform-Based Low-Rank Tensor Factorization for Multi-Dimensional Data Recovery
Jian-Li Wang, Xi-Le Zhao*
[pdf]
[DOI]

CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
Zhengyi Wang*, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu
[pdf]
[DOI]

Domain Reduction Strategy for Non-Line-of-Sight Imaging
Hyunbo Shim, In Cho, Daekyu Kwon, Seon Joo Kim*
[pdf]
[DOI]

HPE-Li: WiFi-enabled Lightweight Dual Selective Kernel Convolution for Human Pose Estimation
Toan D. Gian, Tien Dac Lai, Thien Van Luong, Kok-Seng Wong, Van-Dinh Nguyen*
[pdf]
[DOI]

Cut out the Middleman: Revisiting Pose-based Gait Recognition
Yang Fu, Saihui Hou*, Shibei Meng, Xuecai Hu*, Chunshui Cao, Xu Liu, Yongzhen Huang
[pdf]
[DOI]

HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images
Jingmeng Li, Lukang Fu, Surun Yang, Hui Wei*
[pdf]
[DOI]

High-Precision Self-Supervised Monocular Depth Estimation with Rich-Resource Prior
Jianbing Shen*, Wencheng Han
[pdf]
[DOI]

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM
Mingrui Li, Shuhong Liu, Heng Zhou, Guohao Zhu, Na Cheng, Tianchen Deng, Hongyu Wang*
[pdf]
[DOI]

View Selection for 3D Captioning via Diffusion Ranking
Tiange Luo*, Justin Johnson, Honglak Lee
[pdf]
[DOI]

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model
Runyi Li*, Xuhan Sheng, Weiqi Li, Jian Zhang*
[pdf]
[DOI]

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models
Yiming Zhao*, Zhouhui Lian*
[pdf]
[DOI]

Confidence Self-Calibration for Multi-Label Class-Incremental Learning
Kaile Du*, Yifan Zhou, Fan Lyu, Yuyang Li, Chen Lu, Guangcan Liu*
[pdf]
[DOI]

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models
Zhe Kong*, Yong Zhang*, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, Guanying Chen, Wei Liu, Wenhan Luo*
[pdf]
[DOI]

Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning
Min-Yeong Park, Jae-Ho Lee, Gyeong-Moon Park*
[pdf]
[DOI]

WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting
Jingjing Wu, Zhengyao Fang, Pengyuan Lyu, Chengquan Zhang, Fanglin Chen, Guangming Lu, Wenjie Pei*
[pdf]
[DOI]

An Incremental Unified Framework for Small Defect Inspection
Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen*, Fugee Tsung
[pdf]
[DOI]

Enhancing Optimization Robustness in 1-bit Neural Networks through Stochastic Sign Descent
NianHui Guo*, Hong Guo, Christoph Meinel, Haojin Yang
[pdf]
[DOI]

Temporally Consistent Stereo Matching
Jiaxi Zeng*, Chengtang Yao, Yuwei Wu*, Yunde Jia
[pdf]
[DOI]

A Rotation-invariant Texture ViT for Fine-Grained Recognition of Esophageal Cancer Endoscopic Ultrasound Images
Tianyi Liu, Shuaishuai S Zhuang, Jiacheng Nie, Geng Chen, Yusheng Guo, Guangquan Zhou*, Jean-Louis Coatrieux, Yang Chen*
[pdf]
[DOI]

BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation
Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee, Kang Zhang, Yu-Jung Heo, Du-Seong Chang, Chang D. Yoo*
[pdf]
[DOI]

Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth
Zimin Xia*, Yujiao Shi, Hongdong Li, Julian F. P. Kooij
[pdf]
[DOI]

BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream
Wenpu Li, Pian Wan, Peng Wang, Jinghang Li, Yi Zhou, Peidong Liu*
[pdf]
[DOI]

Human Motion Forecasting in Dynamic Domain Shifts: A Homeostatic Continual Test-time Adaptation Framework
Qiongjie Cui*, Huaijiang Sun, Bin Li, Jianfeng Lu, Weiqing Li
[pdf]
[DOI]

CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation
Hajin Shim, Changhun Kim, Eunho Yang*
[pdf]
[DOI]

DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment
Yunpeng Bai*, Xintao Wang, Yan-Pei Cao, Yixiao Ge, Chun Yuan, Ying Shan
[pdf]
[DOI]

FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation
Honghao Xu, Juzhan Xu, Zeyu Huang, Pengfei Xu, Hui Huang, Ruizhen Hu*
[pdf]
[DOI]

BugNIST - a Large Volumetric Dataset for Detection under Domain Shift
Patrick M Jensen, Vedrana A Dahl, Rebecca Engberg, Carsten Gundlach, Hans Martin Kjer, Anders B Dahl*
[pdf]
[DOI]

SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis
Huan-ang Gao, Mingju Gao, Jiaju Li, Wenyi Li, Rong Zhi, Hao Tang, Hao Zhao*
[pdf]
[DOI]

PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture
Zhuojun Li*, Chun Yu*, Chen Liang, Yuanchun Shi
[pdf]
[DOI]

PixArt-Sigma: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Junsong Chen, Chongjian GE, Enze Xie*, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
[pdf]
[DOI]

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection
Xincheng Yao*, Ruoqi Li, Zefeng Qian, lu wang, Chongyang Zhang*
[pdf]
[DOI]

A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks
Yixiang Qiu*, Hao Fang, Hongyao Yu, Bin Chen*, Meikang Qiu, Shu-Tao Xia
[pdf]
[DOI]

Improving Unsupervised Domain Adaptation: A Pseudo-Candidate Set Approach
Aveen Dayal*, Rishabh Lalla, Linga Reddy Cenkeramaddi, C. Krishna Mohan, Abhinav Kumar, Vineeth N Balasubramanian
[pdf]
[DOI]

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
Zhenglin Zhou*, Fan Ma, Hehe Fan, Zongxin Yang, Yi Yang
[pdf]
[DOI]

DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
Yixuan Wu*, Yizhou Wang, Shixiang Tang, Wenhao Wu, Tong He, Wanli Ouyang, Philip Torr, Jian Wu
[pdf]
[DOI]

Surface-Centric Modeling for High-Fidelity Generalizable Neural Surface Reconstruction
Rui Peng, Shihe Shen, Kaiqiang Xiong, Huachen Gao, Jianbo Jiao, Xiaodong Gu, Ronggang Wang*
[pdf]
[DOI]

HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang*, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang
[pdf]
[DOI]

Multiscale Graph Texture Network
Ravishankar Evani*, Deepu Rajan, Shangbo Mao
[pdf]
[DOI]

HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis
Fangqin Zhou*, Mert Kilickaya, Joaquin Vanschoren, Ran Piao
[pdf]
[DOI]

Integer-Valued Training and Spike-driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection
Xinhao Luo, Man Yao, Yuhong Chou, Bo Xu, Guoqi Li*
[pdf]
[DOI]

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Jianbing Shen, Chunliang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao*
[pdf]
[DOI]

Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation
Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon, Kuk-Jin Yoon*
[pdf]
[DOI]

Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection
Harsh Shah*, Kashish Mittal, Ajit Rajwade*
[pdf]
[DOI]

CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization
K L Navaneet*, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, Hamed Pirsiavash
[pdf]
[DOI]

SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection
Anay Majee*, Ryan X Sharp, Rishabh Iyer*
[pdf]
[DOI]

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models
Yixuan Ren*, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava
[pdf]
[DOI]

S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition
Mohamed Abdelfattah*, Alexandre Alahi
[pdf]
[DOI]

∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions
Minh-Quan Le*, Alexandros Graikos, Srikar Yellapragada, Rajarsi Gupta, Joel Saltz, Dimitris Samaras
[pdf]
[DOI]

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing
Jing Gu*, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Yilin Wang*, Xin Eric Wang*
[pdf]
[DOI]

Interaction-centric Spatio-Temporal Context Reasoning for Multi-Person Video HOI Recognition
Yisong Wang, Nan Xi*, Jingjing Meng, Junsong Yuan
[pdf]
[DOI]

Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing
Ioannis Maniadis Metaxas*, Georgios Tzimiropoulos, Ioannis Patras
[pdf]
[DOI]

ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul A Jennings, Xingyu Zhao*
[pdf]
[DOI]

Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
Akshay Paruchuri*, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen Pizer, Marc Niethammer, Roni Sengupta
[pdf]
[DOI]

OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks
jingyang xiang*, Zuohui Chen, Siqi Li, Qing Wu, Yong Liu
[pdf]
[DOI]

Multistain Pretraining for Slide Representation Learning in Pathology
Guillaume Jaume*, Anurag J Vaidya*, Andrew Zhang, Andrew Song, Richard J Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long P Le, Faisal Mahmood*
[pdf]
[DOI]

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Qing Jiang*, Feng Li, Zhaoyang Zeng, Shilong Liu, Tianhe Ren, Lei Zhang*
[pdf]
[DOI]

Harmonizing knowledge Transfer in Neural Network with Unified Distillation
yaomin huang, Faming Fang, Zaoming Yan, Chaomin Shen, Guixu Zhang*
[pdf]
[DOI]

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
Shufan Li*, Aditya Grover, Harkanwar Singh
[pdf]
[DOI]

Click Prompt Learning with Optimal Transport for Interactive Segmentation
Jie Liu*, Haochen wang, Wenzhe Yin, Jan-Jakob Sonke, Efstratios Gavves
[pdf]
[DOI]

3D Human Pose Estimation via Non-Causal Retentive Networks
Kaili Zheng, Feixiang Lu, Yihao Lv, Liangjun Zhang, Chenyi Guo*, Ji Wu*
[pdf]
[DOI]

OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection
Dongkwon Jin, Chang-Su Kim*
[pdf]
[DOI]

6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry
Sungho Chun, Ju Yong Chang*
[pdf]
[DOI]

Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu*, Ruiying Lu, Ying Fu, Xin Yuan
[pdf]
[DOI]

Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano*, Ryo Hachiuma, Ryo Fujii, Hideo Saito
[pdf]
[DOI]

Enhancing Tampered Text Detection through Frequency Feature Fusion and Decomposition
Zhongxi Chen, Shen Chen, Taiping Yao*, Ke Sun, Shouhong Ding, Xianming Lin*, Liujuan Cao, Rongrong Ji
[pdf]
[DOI]

Modeling Label Correlations with Latent Context for Multi-Label Recognition
Zhaomin Chen*, Quan Cui, Ruoxi Deng, Jie Hu, Guodao Zhang*
[pdf]
[DOI]

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model
Yulin Luo, Ruichuan An, Bocheng Zou, Yiming Tang, Jiaming Liu, Shanghang Zhang*
[pdf]
[DOI]

Finding a needle in a haystack: A Black-Box Approach to Invisible Watermark Detection
Minzhou Pan*, Zhenting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin
[pdf]
[DOI]

DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction
Yuxin Yao, Siyu Ren, Junhui Hou*, Zhi Deng, Juyong Zhang, Wenping Wang
[pdf]
[DOI]

MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos
Yihong Sun*, Bharath Hariharan
[pdf]
[DOI]

ARoFace: Alignment Robustness to Improve Low-quality Face Recognition
Mohammad Saeed Ebrahimi Saadabadi*, Sahar Rahimi Malakshan, Ali Dabouei, Nasser Nasrabadi
[pdf]
[DOI]

Learning Diffusion Models for Multi-View Anomaly Detection
Chieh Liu*, Yu-Min Chu*, Ting-I Hsieh*, Hwann-Tzong Chen*, Tyng-Luh Liu*
[pdf]
[DOI]

Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation
Zhihang Zhong, Gurunandan Krishnan, Xiao Sun, Yu Qiao, Sizhuo Ma*, Jian Wang*
[pdf]
[DOI]

Multi-modal Relation Distillation for Unified 3D Representation Learning
Huiqun Wang, Yiping Bao, Panwang Pan, Zeming Li, Xiao Liu, Ruijie Yang, Di Huang*
[pdf]
[DOI]

Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi*, Tianyang Han, Wei Xiong, Jipeng ZHANG, Runtao Liu, Rui Pan, Tong Zhang
[pdf]
[DOI]

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation
Siyu Jiao*, hongguang Zhu, Yunchao Wei, Yao Zhao*, Jiannan Huang, Humphrey Shi
[pdf]
[DOI]

Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification
Dekun Lin*, Zhe Cui, Rui Chen, Tailai Peng, xinran xie, Xiaolin Qin
[pdf]
[DOI]

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation
Shuzhao Xie*, Weixiang Zhang, Chen Tang, Yunpeng Bai, Rongwei Lu, Shjia Ge, Zhi Wang
[pdf]
[DOI]

LongVLM: Efficient Long Video Understanding via Large Language Models
Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang*
[pdf]
[DOI]

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World
Weiyun Wang, yiming ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, Yu Qiao, Jifeng Dai*
[pdf]
[DOI]

Neural Metamorphosis
Xingyi Yang*, Xinchao Wang*
[pdf]
[DOI]

WHAC: World-grounded Humans and Cameras
Wanqi Yin, Zhongang Cai, Chen Wei, Fanzhou Wang, Ruisi Wang, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang*
[pdf]
[DOI]

Federated Learning with Local Openset Noisy Labels
Zonglin Di*, Zhaowei Zhu, Xiaoxiao Li, Yang Liu*
[pdf]
[DOI]

Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection
Jiacheng Deng*, Jiahao Lu, Tianzhu Zhang
[pdf]
[DOI]

PSALM: Pixelwise Segmentation with Large Multi-modal Model
Zheng Zhang, yeyao ma, Enming Zhang, Xiang Bai*
[pdf]
[DOI]

Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model
Shoma Iwai*, Atsuki Osanai, Shunsuke Kitada, Shinichiro Omachi
[pdf]
[DOI]

Active Coarse-to-Fine Segmentation of Moveable Parts from Real Images
Ruiqi Wang*, Akshay Gadi Patil, Fenggen Yu, Hao Zhang
[pdf]
[DOI]

Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture
Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan*
[pdf]
[DOI]

Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities
Xu Zheng*, Yuanhuiyi Lyu, Lin Wang*
[pdf]
[DOI]

Kinetic Typography Diffusion Model
Seonmi Park, Inhwan Bae, Seunghyun Shin, Hae-Gon Jeon*
[pdf]
[DOI]

Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction
Shuchi Wu*, Chuan Ma*, Kang Wei*, Xiaogang XU, Ming Ding, Yuwen Qian, Di Xiao, Tao Xiang
[pdf]
[DOI]

Light-in-Flight for a World-in-Motion
Jongho Lee*, Ryan J Suess, Mohit Gupta
[pdf]
[DOI]

GroupDiff: Diffusion-based Group Portrait Editing
Yuming Jiang, Nanxuan Zhao*, Qing Liu, Krishna Kumar Singh, Shuai Yang, Chen Change Loy, Ziwei Liu
[pdf]
[DOI]

Faceptor: A Generalist Model for Face Perception
Lixiong Qin*, Mei Wang, Xuannan Liu, Yuhang Zhang, Wei Deng, Xiaoshuai Song, Weiran Xu*, Weihong Deng
[pdf]
[DOI]

Inter-Class Topology Alignment for Efficient Black-Box Substitute Attacks
Lingzhuang Meng, Mingwen Shao*, Yuanjian Qiao, Wenjie Liu
[pdf]
[DOI]

Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang, Songyou Peng, Ayca Takmaz, Federico Tombari, Marc Pollefeys, Shiji Song, Gao Huang*, Francis Engelmann
[pdf]
[DOI]

InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
zhenhua xu*, Kwan-Yee K. Wong, Hengshuang Zhao
[pdf]
[DOI]

KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval
Xianwei Zhuang*, Hongxiang Li, Xuxin Cheng, Zhihong Zhu, Yuxin Xie, Yuexian Zou
[pdf]
[DOI]

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
Chuanrui Zhang*, Yonggen Ling*, Minglei Lu, Minghan Qin, Haoqian Wang*
[pdf]
[DOI]

Learning with Unmasked Tokens Drives Stronger Vision Learners
Taekyung Kim*, Sanghyuk Chun, Byeongho Heo, Dongyoon Han*
[pdf]
[DOI]

Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken
Peifu Liu, Tingfa Xu*, Jie Wang, Huan Chen, Huiyan Bai, Jianan Li*
[pdf]
[DOI]

Multi-Task Domain Adaptation for Language Grounding with 3D Objects
Penglei Sun, Yaoxian Song, Xinglin Pan, Peijie Dong, Xiaofei Yang, Qiang Wang*, Zhixu Li, Tiefeng Li, Xiaowen Chu*
[pdf]
[DOI]

Efficient Active Domain Adaptation for Semantic Segmentation by Selecting Information-rich Superpixels
Yuan Gao, Zilei Wang*, Yixin Zhang, Bohai Tu
[pdf]
[DOI]

Efficient Training of Spiking Neural Networks with Multi-Parallel Implicit Stream Architecture
Zhigao Cao, Meng Li, Xiashuang Wang, Haoyu Wang, Fan Wang, Youjun Li, Zigang Huang*
[pdf]
[DOI]

Camera-LiDAR Cross-modality Gait Recognition
Wenxuan Guo*, Yingping Liang, Zhiyu Pan, Ziheng Xi, Jianjiang Feng, Jie Zhou
[pdf]
[DOI]

LiteSAM is Actually what you Need for segment Everything
Jianhai Fu, Yuanjie Yu, Ningchuan Li*, Yi Zhang, Qichao Chen, Jianping Xiong, Jun Yin, Zhiyu Xiang*
[pdf]
[DOI]

IGNORE: Information Gap-based False Negative Loss Rejection for Single Positive Multi-Label Learning
Gyeong Ryeol Song, Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee*
[pdf]
[DOI]

Visual Prompting via Partial Optimal Transport
Mengyu Zheng*, Zhiwei Hao, Yehui Tang, Chang Xu*
[pdf]
[DOI]

Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model
Guanren Qiao, Guiliang Liu*, Guorui Quan, Rongxiao Qu
[pdf]
[DOI]

Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation
Chongjie Si, Xuehui Wang, Xiaokang Yang, Wei Shen*
[pdf]
[DOI]

AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection
Yunkang Cao*, Jiangning Zhang, Luca Frittoli, Yuqi Cheng, Weiming Shen*, Giacomo Boracchi
[pdf]
[DOI]

Pathformer3D: A 3D Scanpath Transformer for 360° Images
Rong Quan, yantao Lai, Mengyu Qiu, Dong Liang*
[pdf]
[DOI]

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection
Matic Fučka*, Vitjan Zavrtanik, Danijel Skočaj
[pdf]
[DOI]

SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection
Hongcheng Zhang, Liu Liang, Pengxin Zeng*, Xiao Song, Zhe Wang
[pdf]
[DOI]

3D Gaussian Parametric Head Model
Yuelang Xu, Lizhen Wang, Zerong Zheng, Zhaoqi Su, Yebin Liu*
[pdf]
[DOI]

RING-NeRF: Rethinking Inductive Biases for Versatile and Efficient Neural Fields
Doriand Petit*, Steve Bourgeois, Dumitru Pavel, Vincent Gay-Bellile, Florian Chabot, Loïc Barthe
[pdf]
[DOI]

Platypus: A Generalized Specialist Model for Reading Text in Various Forms
Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang*, Cong Yao*
[pdf]
[DOI]

Structured-NeRF: Hierarchical Scene Graph with Neural Representation
Zhide Zhong, Jiakai Cao, songen gu, Sirui Xie, Liyi Luo, Hao Zhao, Guyue Zhou, Haoang Li, Zike Yan*
[pdf]
[DOI]

EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation
Nikolai Körber*, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller
[pdf]
[DOI]

Plug-and-Play Learned Proximal Trajectory for 3D Sparse-View X-Ray Computed Tomography
Romain Vo*, Julie Escoda, Caroline Vienne, Etienne Decenciere
[pdf]
[DOI]

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving
Zhili Chen, Maosheng Ye, Shuangjie Xu, Tongyi Cao, Qifeng Chen*
[pdf]
[DOI]

Test-Time Stain Adaptation with Diffusion Models for Histopathology Image Classification
Cheng-Chang Tsai*, Yuan-Chih Chen, Chun-Shien Lu*
[pdf]
[DOI]

Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan, Libo Zhang*
[pdf]
[DOI]

Temporal Event Stereo via Joint Learning with Stereoscopic Flow
Hoonhee Cho, Jae-Young Kang, Kuk-Jin Yoon*
[pdf]
[DOI]

SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection
Huafeng Chen, Pengxu Wei, Guangqian Guo, Shan Gao*
[pdf]
[DOI]

Just a Hint: Point-Supervised Camouflaged Object Detection
Huafeng Chen, Dian SHAO*, Guangqian Guo, Shan Gao*
[pdf]
[DOI]

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu, Shiyi Zhang, Ziwei Wang*, Changliu Liu, Jiwen Lu, Yansong Tang
[pdf]
[DOI]

Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
Xingyu Peng, Yan Bai, Chen Gao, Lirong Yang, Fei Xia, Beipeng Mu, Xiaofei Wang, Si Liu*
[pdf]
[DOI]

Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection
Zhili Chen, Shuangjie Xu, Maosheng Ye, Zian Qian, Xiaoyi Zou, Dit-Yan Yeung, Qifeng Chen*
[pdf]
[DOI]

View-Consistent 3D Editing with Gaussian Splatting
Yuxuan Wang*, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, Hanwang Zhang
[pdf]
[DOI]

E3V-K5: An Authentic Benchmark for Redefining Video-Based Energy Expenditure Estimation
Shengxuming Zhang, Lei Jin, Yifan Wang, Xinyu Wang, Xu Wen, Zunlei Feng*, Mingli Song
[pdf]
[DOI]

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering
Yanyan Li*, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari
[pdf]
[DOI]

URS-NeRF: Unordered Rolling Shutter Bundle Adjustment for Neural Radiance Fields
Bo Xu*, Liu Ziao, Mengqi Guo, jiancheng Li, Gim Hee Lee
[pdf]
[DOI]

InstructIR: High-Quality Image Restoration Following Human Instructions
Marcos V. Conde*, Gregor Geigle, Radu Timofte
[pdf]
[DOI]

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving
Yuan Chen, Zi-han Ding, Ziqin Wang, Yan Wang*, Lijun Zhang, Si Liu*
[pdf]
[DOI]

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Lanqing Guo, Yingqing HE, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen*
[pdf]
[DOI]

LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro*, Naoto Inoue*, Kento Masui, Mayu Otani, Hideki Nakayama
[pdf]
[DOI]

Making Large Language Models Better Planners with Reasoning-Decision Alignment
Zhijian Huang, Tao Tang, Shaoxiang Chen, Sihao Lin, Zequn Jie, Lin Ma, Guangrun Wang, Xiaodan Liang*
[pdf]
[DOI]

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection
Zheyuan Zhou, Le Wang, Naiyu Fang, Zili Wang, Lemiao Qiu*, Shuyou Zhang
[pdf]
[DOI]

Representation Enhancement-Stabilization: Reducing Bias-Variance of Domain Generalization
Wei Huang*, Yilei Shi, Zhitong Xiong, Xiao Xiang Zhu
[pdf]
[DOI]

Continual Learning for Remote Physiological Measurement: Minimize Forgetting and Simplify Inference
Qian Liang, Yan Chen, Yang Hu*
[pdf]
[DOI]

An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes
Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong*, Liefeng Bo, Qixing Huang*
[pdf]
[DOI]

STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians
Yifei Zeng, Yanqin Jiang, Siyu Zhu, Yuanxun Lu, Youtian Lin, Hao Zhu, Weiming Hu, Xun Cao, Yao Yao*
[pdf]
[DOI]

RGBD GS-ICP SLAM
Seongbo Ha, Jiung Yeon, Hyeonwoo Yu*
[pdf]
[DOI]

Efficient NeRF Optimization - Not All Samples Remain Equally Hard
Juuso Korhonen*, Goutham Rangu, Hamed Rezazadegan Tavakoli, Juho Kannala
[pdf]
[DOI]

Revisiting Calibration of Wide-Angle Radially Symmetric Cameras
Andrea Porfiri Dal Cin*, Francesco Azzoni, Giacomo Boracchi, Luca Magri*
[pdf]
[DOI]

Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs
Georgy Perevozchikov*, Nancy Mehta*, Mahmoud Afifi*, Radu Timofte*
[pdf]
[DOI]

Robust Incremental Structure-from-Motion with Hybrid Features
Shaohui Liu*, Yidan Gao, Tianyi Zhang, Rémi Pautrat, Johannes L Schönberger, Viktor Larsson, Marc Pollefeys
[pdf]
[DOI]

Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels
Rui Zhao, Huibin Yan, Shuoyao Wang*
[pdf]
[DOI]

Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment
Yufan Liu*, Wanqian Zhang, Dayan Wu, Zheng Lin, jingzi Gu, Weiping Wang
[pdf]
[DOI]

Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models
Qinyu Yang, Haoxin Chen, Yong Zhang*, Menghan Xia, Xiaodong Cun, Zhixun Su*, Ying Shan
[pdf]
[DOI]

UniCal: Unified Neural Sensor Calibration
Ze Yang*, George G Chen, Haowei Zhang, Kevin Ta, Ioan Andrei Bârsan, Daniel Murphy, Sivabalan Manivasagam*, Raquel Urtasun*
[pdf]
[DOI]

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
Longxiang Tang*, Zhuotao Tian, Kai Li, Chunming He, Hantao Zhou, Hengshuang Zhao, Xiu Li, Jiaya Jia
[pdf]
[DOI]

Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter
Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang*
[pdf]
[DOI]

Pseudo-Embedding for Generalized Few-Shot Point Cloud Segmentation
Chih-Jung Tsai, Hwann-Tzong Chen*, Tyng-Luh Liu
[pdf]
[DOI]

WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
Pingyi Chen*, Chenglu Zhu, Sunyi Zheng, Honglin Li, Lin Yang*
[pdf]
[DOI]

ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions
Anindita Ghosh*, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek
[pdf]
[DOI]

Statewide Visual Geolocalization in the Wild
Florian Fervers*, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
[pdf]
[DOI]

Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
Yiwen Tang, Ray Zhang, Jiaming Liu, Zoey Guo, Bin Zhao*, Zhigang Wang, Dong Wang*, Peng Gao, Hongsheng Li, Xuelong Li
[pdf]
[DOI]

Trajectory-aligned Space-time Tokens for Few-shot Action Recognition
Pulkit Kumar*, Namitha Padmanabhan, Luke Luo, Sai Saketh Rambhatla, Abhinav Shrivastava
[pdf]
[DOI]

EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel*, Shyamgopal Karthik, Mariana-Iuliana Georgescu, Zeynep Akata
[pdf]
[DOI]

Synchronization of Projective Transformations
Rakshith Madhavan*, Andrea Fusiello, Federica Arrigoni
[pdf]
[DOI]

TLControl: Trajectory and Language Control for Human Motion Synthesis
Weilin Wan*, Zhiyang Dou, Taku Komura, Wenping Wang, Dinesh Jayaraman, Lingjie Liu
[pdf]
[DOI]

Insect Identification in the Wild: The AMI Dataset
Aditya Jain*, Fagner Cunha, Michael J Bunsen, Juan Sebastián Cañas, Léonard Pasi, Nathan Pinoy, Flemming Helsing, JoAnne Russo, Marc S Botham, Michael Sabourin, Jonathan Fréchette, Alexandre Anctil, Yacksecari Lopez, Eduardo Navarro, Filonila Pérez, Ana C Zamora, Jose Alejandro Ramirez-Silva, Jonathan Gagnon, Tom A August, Kim Bjerge, Alba Gomez Segura, Marc Belisle, Yves Basset, Kent P McFarland, David B Roy, Toke T Høye, Maxim Larrivee, David Rolnick
[pdf]
[DOI]

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network
Junyan Ye, Zhutao Lv, Weijia Li*, Jinhua Yu, Haote Yang, Huaping Zhong, Conghui He*
[pdf]
[DOI]

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
Jie Yang, Xuesong Niu, Nan Jiang, Ruimao Zhang*, Siyuan Huang*
[pdf]
[DOI]

Test-time Model Adaptation for Image Reconstruction Using Self-supervised Adaptive Layers
Yutian Zhao, Tianjing Zhang, Hui Ji*
[pdf]
[DOI]

SHIC: Shape-Image Correspondences with no Keypoint Supervision
Aleksandar Shtedritski*, Christian Rupprecht, Andrea Vedaldi
[pdf]
[DOI]

GenRC: Generative 3D Room Completion from Sparse Image Collections
Ming-Feng Li*, Yueh-Feng Ku, Hong-Xuan Yen, Chi Liu, Yu-Lun Liu, Albert Y Chen, Cheng-Hao Kuo, Min Sun
[pdf]
[DOI]

A Probability-guided Sampler for Neural Implicit Surface Rendering
Gonçalo José Dias Pais, Valter André Piedade, Moitreya Chatterjee, Marcus Greiff, Pedro Miraldo*
[pdf]
[DOI]

ReMatching: Low-Resolution Representations for Scalable Shape Correspondence
Filippo Maggioli*, Daniele Baieri, Emanuele Rodola, Simone Melzi
[pdf]
[DOI]

Where am I? Scene Retrieval with Language
Jiaqi Chen*, Daniel Barath, Iro Armeni, Marc Pollefeys, Hermann Blum
[pdf]
[DOI]

This Probably Looks Exactly Like That: An Invertible Prototypical Network
Zachariah Carmichael*, Timothy P Redgrave, Daniel Gonzalez Cedre, Walter Scheirer
[pdf]
[DOI]

Arc2Face: A Foundation Model for ID-Consistent Human Faces
Foivos Paraperas Papantoniou*, Alexandros Lattas, Stylianos Moschoglou, Jiankang Deng, Bernhard Kainz, Stefanos Zafeiriou
[pdf]
[DOI]

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations
Yang Zheng*, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein
[pdf]
[DOI]

Revisiting Feature Disentanglement Strategy in Diffusion Training and Breaking Conditional Independence Assumption in Sampling
Wonwoong Cho*, Hareesh Ravi*, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David Iseri Inouye*, Ajinkya Kale*
[pdf]
[DOI]

SweepNet: Unsupervised Learning Shape Abstraction via Neural Sweepers
Mingrui Zhao*, Yizhi Wang, Fenggen Yu, Changqing Zou, Ali Mahdavi-Amiri
[pdf]
[DOI]

Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions
Jiacong Xu*, Mingqian Liao, Ram Prabhakar Kathirvel, Vishal Patel
[pdf]
[DOI]

On the Viability of Monocular Depth Pre-training for Semantic Segmentation
Dong Lao*, Fengyu Yang, Daniel Wang, Hyoungseob Park, Samuel Lu, Alex Wong, Stefano Soatto
[pdf]
[DOI]

Fairness-aware Vision Transformer via Debiased Self-Attention
Yao Qiang, Chengyin Li, Prashant Khanduri, Dongxiao Zhu*
[pdf]
[DOI]

EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar*, Arya Bakhtiar, Danny L Tran, Antonio Loquercio, Jathushan Rajasegaran, Yann LeCun, Amir Globerson, Trevor Darrell
[pdf]
[DOI]

Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu*, Venkatesh Saligrama*
[pdf]
[DOI]

Neural graphics texture compression supporting random access
Farzad Farhadzadeh*, Qiqi Hou, Hoang Le, Amir Said, Randall R Rauwendaal, Alex Bourd, Fatih Porikli
[pdf]
[DOI]

Contrastive Learning with Synthetic Positives
Dewen Zeng*, Xinrong Hu, Yawen Wu, Xiaowei Xu, Yiyu Shi
[pdf]
[DOI]

GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features
Luc P.J. Sträter*, Mohammadreza Salehi, Efstratios Gavves, Cees G.M. Snoek, Yuki M. Asano
[pdf]
[DOI]

Interpretability-Guided Test-Time Adversarial Defense
Akshay Kulkarni*, Tsui-Wei Weng
[pdf]
[DOI]

DIM: Dyadic Interaction Modeling for Social Behavior Generation
Minh Tran*, Di Chang, Maksim Siniukov, Mohammad Soleymani
[pdf]
[DOI]

Tri^{2}-plane: Thinking Head Avatar via Feature Pyramid
Luchuan Song*, Pinxin Liu, Lele Chen, Guojun Yin, Chenliang Xu
[pdf]
[DOI]

ControlCap: Controllable Region-level Captioning
Yuzhong Zhao, Liu Yue, Zonghao Guo, Weijia Wu, Chen Gong, Qixiang Ye, Fang Wan*
[pdf]
[DOI]

Free Lunch for Gait Recognition: A Novel Relation Descriptor
Jilong Wang*, Saihui Hou, Yan Huang, Chunshui Cao, Xu Liu, Yongzhen Huang, Tianzhu Zhang, Liang Wang*
[pdf]
[DOI]

SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang*, Gaowen Liu, Mubarak Shah, Yan Yan
[pdf]
[DOI]

Adaptive Correspondence Scoring for Unsupervised Medical Image Registration
Xiaoran Zhang*, John C. Stendahl, Lawrence H. Staib, Albert J. Sinusas, Alex Wong, James S. Duncan
[pdf]
[DOI]

MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Nithin Gopalakrishnan Nair*, Jeya Maria Jose Valanarasu, Vishal Patel
[pdf]
[DOI]

Watch Your Steps: Local Image and Scene Editing by Text Instructions
Ashkan Mirzaei*, Tristan T Aumentado-Armstrong, Marcus A Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G Derpanis, Igor Gilitschenski
[pdf]
[DOI]

Forget More to Learn More: Domain-specific Feature Unlearning for Semi-supervised and Unsupervised Domain Adaptation
Hritam Basak*, Zhaozheng Yin
[pdf]
[DOI]

3x2: 3D Object Part Segmentation by 2D Semantic Correspondences
Anh Thai*, Weiyao Wang, Hao Tang, Stefan Stojanov, James M Rehg, Matt Feiszli
[pdf]
[DOI]

Idea2Img: Iterative Self-Refinement with GPT-4V for Automatic Image Design and Generation
Zhengyuan Yang*, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Lijuan Wang
[pdf]
[DOI]

Human-in-the-Loop Visual Re-ID for Population Size Estimation
Gustavo Perez*, Daniel Sheldon, Grant Van Horn, Subhransu Maji
[pdf]
[DOI]

SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation
Lingchen Meng, Shiyi Lan, Hengduo Li, Jose M Alvarez, Zuxuan Wu*, Yu-Gang Jiang
[pdf]
[DOI]

"PointNeRF++: A multi-scale, point-based Neural Radiance Field"
Weiwei Sun, Eduard Trulls, Yang-Che Tseng, Sneha Sambandam, Gopal Sharma, Andrea Tagliasacchi, Kwang Moo Yi*
[pdf]
[DOI]

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao, Ziqi Zhou, Wenxuan Li, Shiyi Lan, Jieru Mei, Zhiding Yu, Bingchen Zhao, Alan Yuille, Yuyin Zhou, Cihang Xie*
[pdf]
[DOI]

UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang*
[pdf]
[DOI]

Fast View Synthesis of Casual Videos with Soup-of-Planes
Yao-Chih Lee*, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu*
[pdf]
[DOI]

Adaptive Human Trajectory Prediction via Latent Corridors
Neerja Thakkar*, Karttikeya Mangalam, Andrea Bajcsy, Jitendra Malik
[pdf]
[DOI]

Video Question Answering with Procedural Programs
Rohan Choudhury*, Koichiro Niinuma, Kris Kitani, Laszlo A Jeni
[pdf]
[DOI]

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu*, Xiwen Chen, Peijie Qiu, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
[pdf]
[DOI]

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
Dong Huo*, Zixin Guo, Xinxin Zuo, Zhihao Shi, Juwei Lu, Peng Dai, Songcen Xu, Li Cheng, Yee-Hong Yang
[pdf]
[DOI]

C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition
Rongchang Li, Zhenhua Feng, Tianyang Xu, Linze Li, Xiao-Jun Wu*, Muhammad Awais, Sara Atito, Josef Kittler
[pdf]
[DOI]

LLMGA: Multimodal Large Language Model based Generation Assistant
Bin Xia*, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia
[pdf]
[DOI]

Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos
Mi Luo*, Zihui Xue, Alex Dimakis, Kristen Grauman
[pdf]
[DOI]

Shape from Heat Conduction
Sriram Narayanan*, Mani Ramanagopal, Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
[pdf]
[DOI]

An Adaptive Screen-Space Meshing Approach for Normal Integration
Moritz Heep*, Eduard Zell
[pdf]
[DOI]

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee*, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang*
[pdf]
[DOI]

HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning
Eugene Valassakis, Guillermo Garcia-Hernando*
[pdf]
[DOI]

Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei*, Abhinav Gupta, Pedro Morgado*
[pdf]
[DOI]

Nuvo: Neural UV Mapping for Unruly 3D Representations
Pratul Srinivasan*, Stephan J Garbin, Dor Verbin, Jonathan T Barron, Ben Mildenhall
[pdf]
[DOI]

Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation
Rong Wang*, Wei Mao, Changsheng Lu, Hongdong Li
[pdf]
[DOI]

AnyHome: Open-Vocabulary Large-Scale Indoor Scene Generation with First-Person View Exploration
Rao Fu*, Zehao Wen, Zichen Liu, Srinath Sridhar
[pdf]
[DOI]

Better Call SAL: Towards Learning to Segment Anything in Lidar
Aljosa Osep*, Tim Meinhardt, Francesco Ferroni, Neehar Peri, Deva Ramanan, Laura Leal-Taixé
[pdf]
[DOI]

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control
Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc Van Gool, Konrad Schindler, Anton Obukhov*
[pdf]
[DOI]

"DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement"
Qimin Chen*, Zhiqin Chen, Vladimir G. Kim, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri
[pdf]
[DOI]

Scene-aware Human Motion Forecasting via Mutual Distance Prediction
Chaoyue Xing*, Wei Mao, Miaomiao Liu
[pdf]
[DOI]

FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
Zehao Zhu, Zhiwen Fan*, Yifan Jiang, Zhangyang Wang*
[pdf]
[DOI]

Open Panoramic Segmentation
Junwei Zheng, Ruiping Liu, Yufan Chen, Kunyu Peng, Chengzhi Wu, Kailun Yang, Jiaming Zhang*, Rainer Stiefelhagen
[pdf]
[DOI]

iMatching: Imperative Correspondence Learning
Zitong Zhan*, Dasong Gao, Yun-Jou Lin, Youjie Xia, Chen Wang*
[pdf]
[DOI]

COSMU: Complete 3D human shape from monocular unconstrained images
Marco Pesavento*, Marco Volino, Adrian Hilton
[pdf]
[DOI]

MAP-ADAPT: Real-Time Quality-Adaptive Semantic 3D Maps
Jianhao Zheng*, Daniel Barath, Marc Pollefeys, Iro Armeni*
[pdf]
[DOI]

Appearance-based Refinement for Object-Centric Motion Segmentation
Junyu Xie*, Weidi Xie, Andrew Zisserman
[pdf]
[DOI]

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
Lukas Hoyer*, David Joseph Tan, Muhammad Ferjad Naeem, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

Open Vocabulary Multi-Label Video Classification
Rohit Gupta*, Mamshad Nayeem Rizve, Jayakrishnan Unnikrishnan, Ashish Tawari, Son Tran, Mubarak Shah, Benjamin Yao, Trishul A Chilimbi
[pdf]
[DOI]

Optimal Transport of Diverse Unsupervised Tasks for Robust Learning from Noisy Few-Shot Data
Xiaofan Que, Qi Yu*
[pdf]
[DOI]

Regularizing Dynamic Radiance Fields with Kinematic Fields
Woobin Im, Geonho Cha, Sebin Lee, Jumin Lee, Juhyeong Seon, Dongyoon Wee, Sungeui Yoon*
[pdf]
[DOI]

MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation
Linyan Yang*, Lukas Hoyer*, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Daniel Cremers, Marc Pollefeys, Luc Van Gool
[pdf]
[DOI]

Efficient Pre-training for Localized Instruction Generation of Procedural Videos
Anil Batra*, Davide Moltisanti, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller
[pdf]
[DOI]

MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution
Yuxuan Jiang*, Chen Feng, Fan Zhang, David Bull
[pdf]
[DOI]

DEAL: Disentangle and Localize Concept-level Explanations for VLMs
Tang Li*, Mengmeng Ma, Xi Peng
[pdf]
[DOI]

Fast Encoding and Decoding for Implicit Video Representation
Hao Chen*, Saining Xie, Ser-Nam Lim, Abhinav Shrivastava
[pdf]
[DOI]

Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models
Zhengming Yu*, Zhiyang Dou, Xiaoxiao Long, Cheng Lin, Zekun Li, Yuan Liu, Norman Müller, Taku Komura, Marc Habermann, Christian Theobalt, Xin Li, Wenping Wang*
[pdf]
[DOI]

Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following
Qiaomu Miao*, Alexandros Graikos, Jingwei Zhang, Sounak Mondal, Minh Hoai, Dimitris Samaras
[pdf]
[DOI]

IMMA: Immunizing text-to-image Models against Malicious Adaptation
Amber Yijia Zheng*, Raymond A. Yeh
[pdf]
[DOI]

Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Jaehyeok Kim, Dongyoon Wee, Dan Xu*
[pdf]
[DOI]

GeoCalib: Learning Single-image Calibration with Geometric Optimization
Alexander Veicht*, Paul-Edouard Sarlin*, Philipp Lindenberger, Marc Pollefeys
[pdf]
[DOI]

3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao*, Longlong Jing, Shangxuan Wu, Alex Zihao Zhu, Jingwei Ji, Chiyu Max Jiang, Wei-Chih Hung, Thomas Funkhouser, Weicheng Kuo, Anelia Angelova, Yin Zhou, Shiwei Sheng
[pdf]
[DOI]

Semicalibrated Relative Pose from an Affine Correspondence and Monodepth
Petr Hruby*, Marc Pollefeys, Daniel Barath
[pdf]
[DOI]

Global Structure-from-Motion Revisited
Linfei Pan*, Daniel Barath, Marc Pollefeys, Johannes L Schönberger
[pdf]
[DOI]

MobileNetV4: Universal Models for the Mobile Ecosystem
Danfeng Qin*, Chas H Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard
[pdf]
[DOI]

Gravity-aligned Rotation Averaging with Circular Regression
Linfei Pan*, Marc Pollefeys, Daniel Barath
[pdf]
[DOI]

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Kunpeng Song*, Yizhe Zhu*, Bingchen Liu*, Qing Yan*, Ahmed Elgammal*, Xiao Yang*
[pdf]
[DOI]

Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments
Djamahl Etchegaray*, Zi Helen Huang, Tatsuya Harada, Yadan Luo
[pdf]
[DOI]

Quanta Video Restoration
Prateek Chennuri*, Yiheng Chi, Enze Jiang, GM Dilshan Godaliyadda*, Abhiram Gnanasambandam*, Hamid R Sheikh, Istvan Gyongy, Stanley H Chan*
[pdf]
[DOI]

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
Rohit Gandikota*, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau
[pdf]
[DOI]

CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model
Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu*
[pdf]
[DOI]

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image
Hallee E. Wong*, Marianne Rakic, John Guttag, Adrian V. Dalca
[pdf]
[DOI]

POCA: Post-training Quantization with Temporal Alignment for Codec Avatars
Jian Meng*, Yuecheng Li*, Leo (Chenghui) Li, Syed Shakib Sarwar, Dilin Wang, Jae-sun Seo*
[pdf]
[DOI]

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts
Wonjae Kim*, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun
[pdf]
[DOI]

Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras
Hoonhee Cho, Sung-Hoon Yoon, Hyeokjun Kweon, Kuk-Jin Yoon*
[pdf]
[DOI]

Unsupervised Dense Prediction using Differentiable Normalized Cuts
Yanbin Liu*, Stephen Gould
[pdf]
[DOI]

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training
Cheng Tan*, Jingxuan Wei*, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Ruifeng Guo, BiHui Yu, Stan Z. Li*
[pdf]
[DOI]

Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization
Jooyeol Yun*, Jaegul Choo
[pdf]
[DOI]

AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion
Yitong Jiang*, Zhaoyang Zhang, Tianfan Xue, Jinwei Gu*
[pdf]
[DOI]

Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
Chi-Pin Huang*, Kai-Po Chang, Chung-Ting Tsai, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
[pdf]
[DOI]

EINet: Point Cloud Completion via Extrapolation and Interpolation
Pingping Cai*, Canyu Zhang, Lingjia Shi, Lili Wang, Nasrin Imanpour, Song Wang
[pdf]
[DOI]

Personalized Video Relighting With an At-Home Light Stage
Jun Myeong Choi*, Max Christman, Roni Sengupta
[pdf]
[DOI]

Temporal Residual Guided Diffusion Framework for Event-Driven Video Reconstruction
Lin Zhu*, Yunlong Zheng, Yijun Zhang, Xiao Wang, Lizhi Wang, Hua Huang
[pdf]
[DOI]

A Secure Image Watermarking Framework with Statistical Guarantees via Adversarial Attacks on Secret Key Networks
Feiyu Chen*, Wei Lin, Ziquan Liu, Antoni Chan
[pdf]
[DOI]

SPIRE: Semantic Prompt-Driven Image Restoration
Chenyang Qi*, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi
[pdf]
[DOI]

Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images
David Junhao Zhang*, Mutian Xu, Jay Zhangjie Wu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou*
[pdf]
[DOI]

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
Xiang Zhang*, Yulun Zhang, Fisher Yu
[pdf]
[DOI]

Audio-Synchronized Visual Animation
Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado*
[pdf]
[DOI]

Expressive Whole-Body 3D Gaussian Avatar
Gyeongsik Moon*, Takaaki Shiratori, Shunsuke Saito
[pdf]
[DOI]

Canonical Shape Projection is All You Need for 3D Few-shot Class Incremental Learning
Ali Cheraghian*, Zeeshan Hayder, Sameera Ramasinghe, Shafin Rahman, Javad Jafaryahya, Lars Petersson, Mehrtash Harandi
[pdf]
[DOI]

Controllable Human-Object Interaction Synthesis
Jiaman Li*, Alexander Clegg, Roozbeh Mottaghi, Jiajun Wu, Xavier Puig, C. Karen Liu
[pdf]
[DOI]

High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Yisheng He*, Weihao Yuan*, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
[pdf]
[DOI]

DoughNet: A Visual Predictive Model for Topological Manipulation of Deformable Objects
Dominik Bauer*, Zhenjia Xu, Shuran Song
[pdf]
[DOI]

PAV: Personalized Head Avatar from Unstructured Video Collection
Akin Caliskan*, Berkay Kicanaoglu, Hyeongwoo Kim
[pdf]
[DOI]

Strike a Balance in Continual Panoptic Segmentation
Jinpeng Chen, Runmin Cong*, Yuxuan Luo, Horace Ho Shing Ip, Sam Kwong*
[pdf]
[DOI]

In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
Dahyun Kang, Minsu Cho*
[pdf]
[DOI]

MultiDelete for Multimodal Machine Unlearning
Jiali Cheng*, Hadi Amiri
[pdf]
[DOI]

Unified Local-Cloud Decision-Making via Reinforcement Learning
Kathakoli Sengupta, Zhongkai Shangguan, Sandesh Bharadwaj, Sanjay Arora, Eshed Ohn-Bar*, Renato Mancuso
[pdf]
[DOI]

UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model
Xiangyu Fan*, Jiaqi Li, Zhiqian Lin, Weiye Xiao, Lei Yang*
[pdf]
[DOI]

Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation
Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu*
[pdf]
[DOI]

Efficient Frequency-Domain Image Deraining with Contrastive Regularization
Ning Gao, Xingyu Jiang, Xiuhui Zhang, Yue Deng*
[pdf]
[DOI]

Stitched ViTs are Flexible Vision Backbones
Zizheng Pan*, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang*
[pdf]
[DOI]

TrajPrompt: Aligning Color Trajectory with Vision-Language Representations
Li-Wu Tsao*, Hao-Tang Tsui, Yu-Rou Tuan, Pei-Chi Chen, Kuan-Lin Wang, Jhih-Ciang Wu, Hong-Han Shuai*, Wen-Huang Cheng
[pdf]
[DOI]

SemReg: Semantics Constrained Point Cloud Registration
Sheldon Fung, Xuequan Lu*, Dasith de Silva Edirimuni, Wei Pan, Xiao Liu, Hongdong Li
[pdf]
[DOI]

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views
Yabo Chen, Jiemin Fang, Yuyang Huang, Taoran Yi, Xiaopeng Zhang*, Lingxi Xie, Xinggang Wang, Wenrui Dai*, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Xiaosu Zhu, Hualian Sheng, Sijia Cai, Bing Deng, Shaopeng Yang, Qiao Liang, Ken Chen, Lianli Gao, Jingkuan Song*, Jieping Ye*
[pdf]
[DOI]

ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
Jiazhi Guan*, Zhiliang Xu, Hang Zhou, Kaisiyuan Wang, Shengyi He, Zhanwang Zhang, Borong Liang, Haocheng Feng, Errui Ding, Jingtuo Liu, Jingdong Wang, Youjian Zhao, Ziwei Liu
[pdf]
[DOI]

Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting
Ri-Zhao Qiu*, Ge Yang, Weijia Zeng, Xiaolong Wang
[pdf]
[DOI]

AlignDiff: Aligning Diffusion Models for General Few-Shot Segmentation
Ri-Zhao Qiu*, Yu-Xiong Wang, Kris Hauser
[pdf]
[DOI]

SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
Jeonghyeok Do, Munchurl Kim*
[pdf]
[DOI]

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Ye Liu, Jixuan He, Wanhua Li*, Junsik Kim, Donglai Wei, Hanspeter Pfister, Chang Wen Chen*
[pdf]
[DOI]

Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors
Jae Joong Lee, Bosheng Li, Sara M Beery, Jonathan Huang, Songlin Fei, Raymond A. Yeh, Bedrich Benes*
[pdf]
[DOI]

Parameterization-driven Neural Surface Reconstruction for Object-oriented Editing in Neural Rendering
Baixin Xu, Jiangbei Hu, Fei Hou, Kwan-Yee Lin, Wayne Wu, Chen Qian, Ying He*
[pdf]
[DOI]

DomainFusion: Generalizing To Unseen Domains with Latent Diffusion Models
Yuyang Huang, Yabo Chen, Yuchen Liu, Xiaopeng Zhang*, Wenrui Dai*, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller*, Niko Suenderhauf, Alex Kenna, Keita Mason
[pdf]
[DOI]

Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su*, Shihao Ji
[pdf]
[DOI]

Robust Multimodal Learning via Representation Decoupling
Shicai Wei, Yang Luo, Yuji Wang, Chunbo Luo*
[pdf]
[DOI]

Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
Yasi Zhang*, Peiyu Yu, Ying Nian Wu
[pdf]
[DOI]

WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing
Shuokang Huang*, Kaihan Li, Di You, Yichong Chen, Arvin Lin, Siying Liu, Xiaohui Li, Julie A. McCann*
[pdf]
[DOI]

Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu, Yubin Cho, Beoungwoo Kang, Seunghun Moon, Kyeongbo Kong, Suk-Ju Kang*
[pdf]
[DOI]

VeCLIP: Improving CLIP Training via Visual-enriched Captions
Zhengfeng Lai*, Haotian Zhang, Bowen Zhang, Wentao Wu, Haoping Bai, Aleksei Timofeev, Xianzhi Du, Zhe Gan, Jiulong Shan, Chen-Nee Chuah, Yinfei Yang, Meng Cao
[pdf]
[DOI]

Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks
Manyuan Zhang*, Guanglu Song, Xiaoyu Shi, Yu Liu, Hongsheng Li
[pdf]
[DOI]

Learning Representations from Foundation Models for Domain Generalized Stereo Matching
Yongjian Zhang, Longguang Wang, Kunhong Li, Wang Yun, Yulan Guo*
[pdf]
[DOI]

Spike-Temporal Latent Representation for Energy-Efficient Event-to-Video Reconstruction
Jianxiong Tang*, Jian-Huang Lai*, Lingxiao Yang, Xiaohua Xie
[pdf]
[DOI]

Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer
Qinji Yu*, Yirui Wang*, Ke Yan, Haoshen Li, Dazhou Guo, Li Zhang, Na Shen, Qifeng Wang, Xiaowei Ding, Le Lu, Xianghua Ye*, Dakai Jin*
[pdf]
[DOI]

Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts
Shuangkang Fang*, Yufeng Wang*, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang
[pdf]
[DOI]

Event-Adapted Video Super-Resolution
Zeyu Xiao, Dachun Kai, Yueyi Zhang, Zheng-Jun Zha, Xiaoyan Sun, Zhiwei Xiong*
[pdf]
[DOI]

Look Hear: Gaze Prediction for Speech-directed Human Attention
Sounak Mondal*, Seoyoung Ahn, Zhibo Yang, Niranjan Balasubramanian, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
[pdf]
[DOI]

Raising the Ceiling: Conflict-Free Local Feature Matching with Dynamic View Switching
Xiaoyong Lu*, Songlin Du*
[pdf]
[DOI]

Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibo Wang*, Weifeng Ge*
[pdf]
[DOI]

Catastrophic Overfitting: A Potential Blessing in Disguise
MN Zhao, Lihe Zhang*, Yuqiu Kong, Baocai Yin
[pdf]
[DOI]

Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework
Shengqi Xu, Run Sun, Yi Chang*, Shuning Cao, Xueyao Xiao, Luxin Yan
[pdf]
[DOI]

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
Yuwei Guo, Ceyuan Yang*, Anyi Rao, Maneesh Agrawala, Dahua Lin*, Bo Dai*
[pdf]
[DOI]

Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin Chen*
[pdf]
[DOI]

Parrot Captions Teach CLIP to Spot Text
Yiqi Lin, Conghui He*, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou
[pdf]
[DOI]

Solving Motion Planning Tasks with a Scalable Generative Model
Yihan Hu*, Siqi Chai, Zhening Yang, Jingyu Qian, Kun Li, Wenxin Shao, Haichao Zhang, Wei Xu, Qiang Liu*
[pdf]
[DOI]

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
Yufei Zhan, Yousong Zhu*, Zhiyang Chen, Fan Yang, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
Huangbiao Xu, Xiao Ke*, Yuezhou Li, Rui Xu, Huanqi Wu, Xiaofeng Lin, Wenzhong Guo
[pdf]
[DOI]

Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen*, Xiruo Jiang, Gensheng Pei, Zeren Sun, Yucheng Wang, Yazhou Yao
[pdf]
[DOI]

BurstM: Deep Burst Multi-scale SR using Fourier Space with Optical Flow
EungGu Kang*, Byeonghun Lee, Sunghoon Im, Kyong Hwan Jin
[pdf]
[DOI]

Diffusion Reward: Learning Rewards via Conditional Video Diffusion
Tao Huang*, Guangqi Jiang, Yanjie Ze, Huazhe Xu*
[pdf]
[DOI]

Recursive Visual Programming
Jiaxin Ge*, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell
[pdf]
[DOI]

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang*, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang
[pdf]
[DOI]

Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon*
[pdf]
[DOI]

Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
Xidong Peng, Runnan Chen, Feng Qiao, Lingdong Kong, Youquan Liu, Yujing Sun, Tai Wang, Xinge Zhu*, Yuexin Ma*
[pdf]
[DOI]

Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging
In Cho, Hyunbo Shim, Seon Joo Kim*
[pdf]
[DOI]

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers
Jinke Li*, Xiao He*, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang*
[pdf]
[DOI]

Fine-grained Dynamic Network for Generic Event Boundary Detection
Ziwei Zheng, Lijun He, Le Yang, Fan Li*
[pdf]
[DOI]

Take A Step Back: Rethinking the Two Stages in Visual Reasoning
Mingyu Zhang, Jiting Cai, Mingyu Liu, Yue Xu, Cewu Lu, Yong-Lu Li*
[pdf]
[DOI]

AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation
Jiannan Ge*, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian
[pdf]
[DOI]

Learning with Counterfactual Explanations for Radiology Report Generation
Mingjie Li*, Haokun Lin, Liang Qiu, Xiaodan Liang*, Ling Chen, Abdulmotaleb Elsaddik, Xiaojun Chang
[pdf]
[DOI]

SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models
Weilong Chai*, Dandan Zheng, Jiajiong Cao, Zhiquan Chen, Changbao Wang, Chenguang Ma
[pdf]
[DOI]

Better Regression Makes Better Test-time Adaptive 3D Object Detection
Jiakang Yuan, Bo Zhang, Kaixiong Gong, Xiangyu Yue, Botian Shi, Yu Qiao, Tao Chen*
[pdf]
[DOI]

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi*, Kaisheng Ma*
[pdf]
[DOI]

Content-Aware Radiance Fields: Aligning Model Complexity with Scene Intricacy Through Learned Bitwidth Quantization
Weihang Liu, Xue Xian Zheng, Jingyi Yu, Xin Lou*
[pdf]
[DOI]

Finding Visual Task Vectors
Alberto Hojel*, Yutong Bai, Trevor Darrell, Amir Globerson, Amir Bar*
[pdf]
[DOI]

Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation
Zongrui Li*, Minghui Hu, Qian Zheng*, Xudong Jiang
[pdf]
[DOI]

Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan*, Liu Liu
[pdf]
[DOI]

Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
Yunbin Tu*, Liang Li, Li Su, Chenggang Yan, Qingming Huang
[pdf]
[DOI]

Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian*, Shuangrui Ding, Dahua Lin
[pdf]
[DOI]

Layer-Wise Relevance Propagation with Conservation Property for ResNet
Seitaro Otsuki*, Tsumugi Iida*, Félix Doublet*, Tsubasa Hirakawa*, Takayoshi Yamashita*, Hironobu Fujiyoshi*, Komei Sugiura*
[pdf]
[DOI]

DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang, Xinyun Jiang, Jun Xiao, Tao Chen, Long Chen*
[pdf]
[DOI]

EgoLifter: Open-world 3D Segmentation for Egocentric Perception
Qiao Gu*, Zhaoyang Lv*, Duncan Frost, Simon Green, Julian Straub, Chris Sweeney*
[pdf]
[DOI]

MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh*, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Sangpil Kim*
[pdf]
[DOI]

Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan, Xiangtai Li*, Chong Zhou, Yining Li, Kai Chen, Chen Change Loy
[pdf]
[DOI]

Data-to-Model Distillation: Data-Efficient Learning Framework
Ahmad Sajedi*, Samir Khaki, Lucy Z. Liu, Ehsan Amjadian, Yuri A. Lawryshyn, Konstantinos N. Plataniotis
[pdf]
[DOI]

DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays
Xuhui Liu, Zhi Qiao, Runkun Liu, Hong Li, Xiantong Zhen*, Zhen Qian, Juan Zhang*, Baochang Zhang
[pdf]
[DOI]

AdaIFL: Adaptive Image Forgery Localization via a Dynamic and Importance-aware Transformer Network
Yuxi Li*, Fuyuan Cheng, Wangbo Yu, Guangshuo Wang, Guibo Luo*, Yuesheng Zhu*
[pdf]
[DOI]

ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion
Yan Hong*, Yuxuan Duan, Bo Zhang, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang*
[pdf]
[DOI]

ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency
Shaocheng Yan, Pengcheng Shi, Jiayuan Li*
[pdf]
[DOI]

Mask as Supervision: Leveraging Unified Mask Information for Unsupervised 3D Pose Estimation
Yuchen Yang, Yu Qiao, Xiao Sun*
[pdf]
[DOI]

MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang*, Yuchen Fan, Kai Zhang*, Radu Timofte, Luc Van Gool, Rakesh Ranjan
[pdf]
[DOI]

SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao*, Bo Wan, Xu Jia, Yunzhi Zhuge, Ying Zhang, Huchuan Lu*, Long Chen
[pdf]
[DOI]

MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection
Hongbin Lin, Yifan Zhang, Shuaicheng Niu, Shuguang Cui, Zhen Li*
[pdf]
[DOI]

RangeLDM: Fast Realistic LiDAR Point Cloud Generation
Qianjiang Hu, Zhimin Zhang, Wei Hu*
[pdf]
[DOI]

Learn to Optimize Denoising Scores: A Unified and Improved Diffusion Prior for 3D Generation
Xiaofeng Yang*, Yiwen Chen, Cheng Chen, Chi Zhang, Yi Xu, Xulei Yang, Fayao Liu, Guosheng Lin
[pdf]
[DOI]

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
Fu-Yun Wang*, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li*
[pdf]
[DOI]

Physically Plausible Color Correction for Neural Radiance Fields
Qi Zhang*, Ying Feng, Hongdong Li*
[pdf]
[DOI]

Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu*, Zhuofan Zhang, Xiaojian Ma, Xuesong Niu, Yixin Chen, Baoxiong Jia, Zhidong Deng*, Siyuan Huang*, Qing Li*
[pdf]
[DOI]

Model Stock: All we need is just a few fine-tuned models
Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han*
[pdf]
[DOI]

Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution
Xi Yang*, Chenhang He, Jianqi Ma, Lei Zhang
[pdf]
[DOI]

PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control
Yong Zhong, Min Zhao, Zebin You, Xiaofeng Yu, Changwang Zhang, Chongxuan Li*
[pdf]
[DOI]

MAD-DR: Map Compression for Visual Localization with Matchness Aware Descriptor Dimension Reduction
Qiang Wang*
[pdf]
[DOI]

Benchmarking Object Detectors with COCO: A New Path Forward
Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai*
[pdf]
[DOI]

Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification
Chenyue Li, Shuoyi Chen, Mang Ye*
[pdf]
[DOI]

WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xin-Jian Wu*, Ruisong Zhang, Jie Qin, Shijie Ma, Cheng-Lin Liu*
[pdf]
[DOI]

Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction
Bencheng Liao, Shaoyu Chen, Bo Jiang, Tianheng Cheng, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang*
[pdf]
[DOI]

DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency
Xiaojing Zhong, Xinyi Huang, Xiaofeng Yang, Guosheng Lin*, Qingyao Wu*
[pdf]
[DOI]

Unleashing the Potential of the Semantic Latent Space in Diffusion Models for Image Dehazing
Zizheng Yang, Hu Yu, Bing Li, Jinghao Zhang, Jie Huang, Feng Zhao*
[pdf]
[DOI]

Uncertainty-aware sign language video retrieval with probability distribution modeling
Xuan Wu*, Hongxiang Li, yuanjiang luo, Xuxin Cheng, Xianwei Zhuang, Meng Cao, Keren Fu*
[pdf]
[DOI]

NeRMo: Learning Implicit Neural Representations for 3D Human Motion Prediction
Dong Wei, Huaijiang Sun, Xiaoning Sun*, Shengxiang Hu
[pdf]
[DOI]

Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Tongkun Guan, Wei Shen*, Xue Yang, Xuehui Wang, Xiaokang Yang
[pdf]
[DOI]

VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition
Ahmad Khaliq, Ming Xu, Stephen Hausler, Michael J Milford, Sourav Garg*
[pdf]
[DOI]

DSA: Discriminative Scatter Analysis for Early Smoke Segmentation
Lujian Yao*, Haitao Zhao*, Jingchao Peng, Zhongze Wang, Kaijie Zhao
[pdf]
[DOI]

SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag*, Koustava Goswami, Srikrishna Karanam
[pdf]
[DOI]

KFD-NeRF: Rethinking Dynamic NeRF with Kalman Filter
Yifan Zhan, Zhuoxiao Li, Muyao Niu, Zhihang Zhong, Shohei Nobuhara, Ko Nishino, Yinqiang Zheng*
[pdf]
[DOI]

Physical-Based Event Camera Simulator
Haiqian Han, Jiacheng Lyu, Jianing Li*, Henglu Wei, Cheng Li, Yajing Wei, SHU CHEN, Xiangyang Ji*
[pdf]
[DOI]

V-IRL: Grounding Virtual Intelligence in Real Life
Jihan Yang*, Runyu Ding, Ellis L Brown, Xiaojuan Qi, Saining Xie
[pdf]
[DOI]

Adversarial Prompt Tuning for Vision-Language Models
Jiaming Zhang, Xingjun Ma*, Xin Wang, Lingyu Qiu, Jiaqi Wang, Yu-Gang Jiang, Jitao Sang*
[pdf]
[DOI]

Relightable 3D Gaussians: Realistic Point Cloud Relighting with BRDF Decomposition and Ray Tracing
Jian Gao, chun gu, Youtian Lin, Zhihao Li, Hao Zhu, Xun Cao, Li Zhang*, Yao Yao*
[pdf]
[DOI]

Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation
Jinfeng Liu*, Lingtong Kong, Bo Li, Zerong Wang, Hong Gu, Jinwei Chen
[pdf]
[DOI]

CC-SAM: Enhancing SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation
Shreyank N Gowda*, David A Clifton
[pdf]
[DOI]

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu*
[pdf]
[DOI]

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)
Qifeng Li*, Xiaosong Jia, Shaobo Wang, Junchi Yan
[pdf]
[DOI]

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion
Guansong Lu*, Yuanfan Guo, Jianhua Han, Minzhe Niu, Yihan Zeng, Songcen Xu, Zeyi Huang, Zhao Zhong, Wei Zhang, Hang Xu
[pdf]
[DOI]

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning
Artemis Panagopoulou*, Le Xue, Ning Yu, LI JUNNAN, DONGXU LI, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
[pdf]
[DOI]

Learning Neural Volumetric Pose Features for Camera Localization
Jingyu Lin, Jiaqi Gu, Bojian Wu, Lubin Fan*, Renjie Chen*, Ligang Liu, Jieping Ye
[pdf]
[DOI]

Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation
Shuangrui Ding*, Rui Qian, Haohang Xu, Dahua Lin, Hongkai Xiong
[pdf]
[DOI]

REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices
Chaojie Ji*, Yufeng Li, Yiyi Liao
[pdf]
[DOI]

Self-Training Room Layout via Geometry-aware Ray-casting
Bolivar Solarte*, Chin-Hsuan Wu*, Jin-Cheng Jhang*, Jonathan Lee*, Yi-Hsuan Tsai*, Min Sun*
[pdf]
[DOI]

Closed-Loop Unsupervised Representation Disentanglement with $\beta$-VAE Distillation and Diffusion Probabilistic Feedback
Xin Jin*, Bohan Li*, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng
[pdf]
[DOI]

Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective
Xiang Fang, Zeyu Xiong, Wanlong Fang, Xiaoye Qu, Chen Chen, Jianfeng Dong, Keke Tang, Pan Zhou*, Yu Cheng, Daizong Liu*
[pdf]
[DOI]

Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization
Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu, Yufeng Jane Tseng*
[pdf]
[DOI]

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model
Fu-Yun Wang*, Zhaoyang Huang*, Qiang Ma, Guanglu Song, Xudong LU, Weikang Bian, Yijin Li, Yu Liu, Hongsheng Li*
[pdf]
[DOI]

Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach
Taolin Zhang, Jiawang Bai, Zhihe Lu, Dongze Lian, genping wang*, Xinchao Wang*, Shu-Tao Xia
[pdf]
[DOI]

Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
Chujie Qin, Ruiqi Wu, Zikun Liu, Xin Lin, Chun-Le Guo, Hyun Hee Park, Chongyi Li*
[pdf]
[DOI]

When Fast Fourier Transform Meets Transformer for Image Restoration
Xingyu Jiang, Xiuhui Zhang, Ning Gao, Yue Deng*
[pdf]
[DOI]

Dolphins: Multimodal Language Model for Driving
Yingzi Ma, Yulong Cao, Jiachen Sun, Marco Pavone, Chaowei Xiao*
[pdf]
[DOI]

Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model
Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing*, Lei Zhao*, Huaizhong Lin*, Jianfeng Dong, Dalong Zhang
[pdf]
[DOI]

CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection
xunfa lai, Zhiyu Yang, Jie Hu, ShengChuan Zhang*, Liujuan Cao, Guannan Jiang, Songan Zhang, zhiyu wang, Rongrong Ji
[pdf]
[DOI]

Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
Pau de Jorge Aranda*, Riccardo Volpi, Puneet Dokania, Philip Torr, Gregory Rogez
[pdf]
[DOI]

Textual Grounding for Open-vocabulary Visual Information Extraction in Layout-diversified Documents
Mengjun Cheng, Chengquan Zhang, Chang Liu*, Yuke Li, Bohan Li, Kun Yao, Xiawu Zheng, Rongrong Ji, Jie Chen
[pdf]
[DOI]

Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching
Ruonan Yu, Songhua Liu, Jingwen Ye, Xinchao Wang*
[pdf]
[DOI]

Rethinking and Improving Visual Prompt Selection for In-Context Learning Segmentation Framework
Wei Suo, Lanqing Lai, Mengyang Sun, Hanwang Zhang, Peng Wang*, Yanning Zhang
[pdf]
[DOI]

D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On
Zhaotong Yang, Zicheng Jiang, Xinzhe Li, Huiyu Zhou, Junyu Dong, Huaidong Zhang, Yong Du*
[pdf]
[DOI]

TC4D: Trajectory-Conditioned Text-to-4D Generation
Sherwin Bahmani*, Xian Liu, Wang Yifan, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B Lindell
[pdf]
[DOI]

Blind Image Deconvolution by Generative-based Kernel Prior and Initializer via Latent Encoding
Jiangtao Zhang, Zongsheng Yue*, Hui Wang, Qian Zhao*, Deyu Meng
[pdf]
[DOI]

AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models
Xuelong Dai*, Kaisheng Liang, Bin Xiao
[pdf]
[DOI]

Improving Text-guided Object Inpainting with Semantic Pre-inpainting
Yifu Chen, Jingwen Chen, Yingwei Pan*, Yehao Li, Ting Yao, Zhineng Chen, Tao Mei
[pdf]
[DOI]

Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching
Yichen Li, Wenchao Xu, Haozhao Wang*, Yining Qi*, Jingcai Guo, Ruixuan Li*
[pdf]
[DOI]

ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images
Xiangtian Xue, Jiasong Wu*, Youyong Kong, Lotfi Senhadji, Huazhong Shu
[pdf]
[DOI]

RS-NeRF: Neural Radiance Fields from Rolling Shutter Images
Muyao Niu, Tong Chen, Yifan Zhan, Zhuoxiao Li, Xiang Ji, Yinqiang Zheng*
[pdf]
[DOI]

Region-Adaptive Transform with Segmentation Prior for Image Compression
Yuxi Liu*, Wenhan Yang, Huihui Bai, Yunchao Wei, Yao Zhao
[pdf]
[DOI]

Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks
Zhewei Wu, Ruilong Yu, Qihe Liu*, Shuying Cheng, Shilin Qiu, Shijie Zhou
[pdf]
[DOI]

SLIM: Spuriousness Mitigation with Minimal Human Annotations
Xiwei Xuan*, Ziquan Deng, Hsuan-Tien Lin, Kwan-Liu Ma
[pdf]
[DOI]

Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset
Mijoo Kim, Junseok Kwon*
[pdf]
[DOI]

X-Pose: Detecting Any Keypoints
Jie Yang, Ailing Zeng*, Ruimao Zhang*, Lei Zhang
[pdf]
[DOI]

M^2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation
Yingshuang Zou*, Yikang Ding, Xi Qiu, Haoqian Wang*, Haotian Zhang*
[pdf]
[DOI]

UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong*, Chengjian Feng, Lin Ma
[pdf]
[DOI]

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
Le Yang*, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li
[pdf]
[DOI]

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Yanwei Li*, Chengyao Wang, Jiaya Jia
[pdf]
[DOI]

MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering
Guoxing Sun*, Rishabh Dabral, Pascal Fua, Christian Theobalt, Marc Habermann
[pdf]
[DOI]

DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong LI*, Chamara Madarasingha, Kanchana Thilakarathna
[pdf]
[DOI]

Multi-branch Collaborative Learning Network for 3D Visual Grounding
Zhipeng Qian, Yiwei Ma, Zhekai Lin, Jiayi Ji, Xiawu Zheng, Xiaoshuai Sun*, Rongrong Ji
[pdf]
[DOI]

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Jinbo Xing*, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong
[pdf]
[DOI]

Motion Aware Event Representation-driven Image Deblurring
Zhijing Sun, Xueyang Fu, Longzhuo Huang, Aiping Liu, Zheng-Jun Zha*
[pdf]
[DOI]

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju*, Haicheng Wang, Haozhe Cheng, Xu Chen, Zhonghua Zhai, Weilin Huang, Jinsong Lan, Shuai Xiao*, Bo Zheng
[pdf]
[DOI]

WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
Zhenxiang Lin, Xidong Peng, Peishan Cong, Ge Zheng, Yujing Sun, Yuenan HOU, Xinge Zhu, Sibei Yang, Yuexin Ma*
[pdf]
[DOI]

RCS-Prompt: Learning Prompt to Rearrange Class Space for Prompt-based Continual Learning
Longrong Yang, Hanbin Zhao, Yunlong Yu*, Xiaodong Zeng, Xi Li*
[pdf]
[DOI]

Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models
Luozhou Wang*, Guibao Shen, Wenhang Ge, Guangyong Chen, Yijun Li, Yingcong Chen*
[pdf]
[DOI]

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu*, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Qing Jiang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang*
[pdf]
[DOI]

Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression
Dingyuan Zhang, Dingkang Liang*, Zichang Tan, Xiaoqing Ye, Cheng Zhang, Jingdong Wang, Xiang Bai*
[pdf]
[DOI]

OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang*, Ya-Li Li, TAICHI LIU, Hengshuang Zhao, Shengjin Wang
[pdf]
[DOI]

CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing
Haibo Jin, Ruoxi Chen, Jinyin Chen, Haibin Zheng, Yang Zhang, Haohan Wang*
[pdf]
[DOI]

UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt
Xin Li*, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen
[pdf]
[DOI]

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Shilong Liu*, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li*
[pdf]
[DOI]

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng*, Wayne Zhang
[pdf]
[DOI]

Two-Stage Active Learning for Efficient Temporal Action Segmentation
Yuhao Su, Ehsan Elhamifar*
[pdf]
[DOI]

TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation
Yufei Liu, Junwei Zhu, Junshu Tang, Shijie Zhang, Jiangning Zhang, Weijian Cao, Chengjie Wang, Yunsheng Wu, Dongjin Huang*
[pdf]
[DOI]

MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views
Wangze Xu, Huachen Gao, Shihe Shen, Rui Peng, Jianbo Jiao, Ronggang Wang*
[pdf]
[DOI]

Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
Yihao Ai*, Yifei Qi, Bo Wang, Yu Cheng, Xinchao Wang, Robby T. Tan
[pdf]
[DOI]

Towards More Practical Group Activity Detection: A New Benchmark and Model
Dongkeun Kim, Youngkil Song, Minsu Cho, Suha Kwak*
[pdf]
[DOI]

Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
Zhiyuan You*, Zheyuan Li, Jinjin Gu*, Zhenfei Yin, Tianfan Xue*, Chao Dong*
[pdf]
[DOI]

Zero-Shot Image Feature Consensus with Deep Functional Maps
Xinle Cheng, Congyue Deng*, Adam Harley, Yixin Zhu*, Leonidas Guibas*
[pdf]
[DOI]

WindPoly: Polygonal Mesh Reconstruction via Winding Numbers
Xin He, Chenlei Lv, Pengdi Huang, Hui Huang*
[pdf]
[DOI]

MinD-3D: Reconstruct High-quality 3D objects in Human Brain
Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu*
[pdf]
[DOI]

Tokenize Anything via Prompting
Ting Pan*, Lulu Tang, Xinlong Wang*, Shiguang Shan
[pdf]
[DOI]

Geospecific View Generation - Geometry-Context Aware High-resolution Ground View Inference from Satellite Views
Ningli Xu, Rongjun Qin*
[pdf]
[DOI]

Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
Jing Wu*, Mehrtash Harandi
[pdf]
[DOI]

City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Kaiwen Song, Xiaoyi Zeng, Chenqu Ren, Juyong Zhang*
[pdf]
[DOI]

GRAPE: Generalizable and Robust Multi-view Facial Capture
Jing Li, Di Kang, Zhenyu He*
[pdf]
[DOI]

Training-Free Model Merging for Multi-target Domain Adaptation
Wenyi Li, Huan-ang Gao, Mingju Gao, Beiwen Tian, Rong Zhi, Hao Zhao*
[pdf]
[DOI]

Multi-RoI Human Mesh Recovery with Camera Consistency and Contrastive Losses
Yongwei Nie, Changzhen Liu, Chengjiang Long, Qing Zhang, Guiqing Li, Hongmin Cai*
[pdf]
[DOI]

Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection
Lianjun Wu, Jiangxiao Han, Zengqiang Zheng, Xinggang Wang*
[pdf]
[DOI]

Open-Vocabulary Camouflaged Object Segmentation
Youwei Pang, Xiaoqi Zhao, JiaMing Zuo, Lihe Zhang*, Huchuan Lu
[pdf]
[DOI]

SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
Xiaoyu Liu, Yuxiang Wei, Ming Liu*, Xianhui Lin, Peiran Ren, xuansong xie, Wangmeng Zuo
[pdf]
[DOI]

InterFusion: Text-Driven Generation of 3D Human-Object Interaction
Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu*, Ruizhen Hu*
[pdf]
[DOI]

GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
Han Zhou, Wei Dong, Xiaohong Liu*, Shuaicheng Liu, Xiongkuo Min, Guangtao Zhai, Jun Chen*
[pdf]
[DOI]

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
Xiaofeng Wang*, Zheng Zhu, Guan Huang, Chen Xinze, Jiagang Zhu, Jiwen Lu
[pdf]
[DOI]

Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition
Muhammad Adi Nugroho*, Sangmin Woo, Sumin Lee, Jinyoung Park, Yooseung Wang, Donguk Kim, Changick Kim
[pdf]
[DOI]

NeRF-XL: NeRF at Any Scale with Multi-GPU
Ruilong Li*, Sanja Fidler, Angjoo Kanazawa, Francis Williams
[pdf]
[DOI]

CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
Jiankun Zhao, Bowen Song, Liyue Shen*
[pdf]
[DOI]

The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
Qinyu Zhao*, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould
[pdf]
[DOI]

Compositional Substitutivity of Visual Reasoning for Visual Question Answering
Chuanhao Li, Zhen Li, Chenchen Jing*, Yuwei Wu*, Mingliang Zhai, Yunde Jia
[pdf]
[DOI]

LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models
Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu*
[pdf]
[DOI]

DNI: Dilutional Noise Initialization for Diffusion Video Editing
Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong, Chang D. Yoo*
[pdf]
[DOI]

Two-Stage Video Shadow Detection via Temporal-Spatial Adaption
Xin Duan, Yu Cao, Lei Zhu, Gang Fu, Xin Wang, Renjie ZHANG, Ping Li*
[pdf]
[DOI]

Towards Physical World Backdoor Attacks against Skeleton Action Recognition
Qichen Zheng, Yi Yu, SIYUAN YANG*, Jun Liu, Kwok-Yan Lam, Alex Kot
[pdf]
[DOI]

SAM-guided Graph Cut for 3D Instance Segmentation
Haoyu Guo*, He Zhu, Sida Peng, Yuang Wang, Yujun Shen, Ruizhen Hu*, Xiaowei Zhou*
[pdf]
[DOI]

Fully Authentic Visual Question Answering Dataset from Online Communities
Chongyan Chen*, Mengchen Liu, Noel C Codella, Yunsheng Li, Lu Yuan, Danna Gurari
[pdf]
[DOI]

Active Generation for Image Classification
Tao Huang, Jiaqi Liu, Shan You*, Chang Xu
[pdf]
[DOI]

FuseTeacher: Modality-fused Encoders are Strong Vision Supervisors
Chen-Wei Xie*, Siyang Sun, Liming Zhao, Pandeng Li, Shuailei Ma, Yun Zheng
[pdf]
[DOI]

Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes
Chao Chen, Yu-Shen Liu*, Zhizhong Han
[pdf]
[DOI]

Understanding Multi-compositional learning in Vision and Language models via Category Theory
Sotirios Panagiotis Chytas*, Hyunwoo J Kim, Vikas Singh
[pdf]
[DOI]

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Shangchao Su, Bin Li*, Xiangyang Xue
[pdf]
[DOI]

Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration
Youngjin Oh*, Keuntek Lee, Jooyoung Lee, Dae-Hyun Lee, Nam Ik Cho
[pdf]
[DOI]

Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
Pengkun Jiao*, Na Zhao*, Jingjing Chen, Yu-Gang Jiang
[pdf]
[DOI]

Diffusion-Guided Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Jaeseok Jeong, Daehee Park, Kuk-Jin Yoon*
[pdf]
[DOI]

Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment
Yang Jin*, Yadong Mu*
[pdf]
[DOI]

When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian*, Ping Luo, Wentao Liu
[pdf]
[DOI]

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image
Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho*, Doyup Lee*
[pdf]
[DOI]

Segment and Recognize Anything at Any Granularity
Feng Li*, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianwei Yang, Lei Zhang*, Jianfeng Gao*
[pdf]
[DOI]

Real-time Holistic Robot Pose Estimation with Unknown States
Shikun Ban, Juling Fan, Xiaoxuan Ma, Wentao Zhu*, Yu QIAO*, Yizhou Wang
[pdf]
[DOI]

CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning
Junghun Oh, Sungyong Baik, Kyoung Mu Lee*
[pdf]
[DOI]

A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars
Ronglai Zuo, Fangyun Wei*, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong
[pdf]
[DOI]

An accurate detection is not all you need to combat label noise in web-noisy datasets
Paul Albert*, Kevin McGuinness, Eric Arazo, Tarun Krishna, Noel O Connor, Jack Valmadre
[pdf]
[DOI]

Online Vectorized HD Map Construction using Geometry
Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding, Fusheng Jin*, Xiangyu Yue
[pdf]
[DOI]

Image-adaptive 3D Lookup Tables for Real-time Image Enhancement with Bilateral Grids
Wontae Kim*, Nam Ik Cho*
[pdf]
[DOI]

Learned HDR Image Compression for Perceptually Optimal Storage and Display
Peibei Cao, HAOYU CHEN, Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie, Haiqing Bai, Kede Ma*
[pdf]
[DOI]

Sparse Beats Dense: Rethinking Supervision in Radar-Camera Depth Completion
Huadong Li, Minhao Jing, Jin Wang, Shichao Dong, Jiajun Liang, Haoqiang Fan, Renhe Ji*
[pdf]
[DOI]

Non-Exemplar Domain Incremental Learning via Cross-Domain Concept Integration
Qiang Wang*, Yuhang He, Songlin Dong, Xinyuan Gao, Shaokun Wang, Yihong Gong
[pdf]
[DOI]

Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression
Yuan Tian*, Guo Lu*, Guangtao Zhai*
[pdf]
[DOI]

Improving Virtual Try-On with Garment-focused Diffusion Models
Siqi Wan, Yehao Li, Jingwen Chen, Yingwei Pan*, Ting Yao, Yang Cao, Tao Mei
[pdf]
[DOI]

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection
Feng Liu*, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou*
[pdf]
[DOI]

Disentangled Generation and Aggregation for Robust Radiance Fields
Shihe Shen, Huachen Gao, Wangze Xu, Rui Peng, Luyang Tang, Kaiqiang Xiong, Jianbo Jiao, Ronggang Wang*
[pdf]
[DOI]

UNIKD: UNcertainty-Filtered Incremental Knowledge Distillation for Neural Implicit Representation
Mengqi Guo*, Chen Li, Hanlin Chen, Gim Hee Lee
[pdf]
[DOI]

Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation
Jiawei Han, Kaiqi Liu*, Wei Li, Guangzhi Chen
[pdf]
[DOI]

MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro*
[pdf]
[DOI]

Semantic-guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift
kangyu xiao*, Zilei Wang, junjie li
[pdf]
[DOI]

Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations
Zipeng Wang*, yunfan lu, Lin Wang*
[pdf]
[DOI]

SDPT: Synchronous Dual Prompt Tuning for Fusion-based Visual-Language Pre-trained Models
Yang Zhou*, Yongjian Wu, Jiya Saiyin, Bingzheng Wei, Maode Lai, Eric I Chang, Yan Xu*
[pdf]
[DOI]

Open-World Dynamic Prompt and Continual Visual Representation Learning
Youngeun Kim, Jun Fang*, Qin Zhang, Zhaowei Cai, Yantao Shen, Rahul Duggal, Dripta S. Raychaudhuri, Zhuowen Tu, Yifan Xing, Onkar Dabeer
[pdf]
[DOI]

Learning Video Context as Interleaved Multimodal Sequences
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou*
[pdf]
[DOI]

Learning Unsigned Distance Functions from Multi-view Images with Volume Rendering Priors
Wenyuan Zhang, Kanle Shi, Yu-Shen Liu*, Zhizhong Han
[pdf]
[DOI]

Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
Ruihuang Li*, Zhengqiang ZHANG, Chenhang He, Zhiyuan Ma, Vishal Patel, Lei Zhang
[pdf]
[DOI]

Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks
Cheng Gong, Yao Chen*, Qiuyang Luo, Ye Lu, Tao Li, Yuzhi Zhang, Yufei Sun*, Le Zhang
[pdf]
[DOI]

Multi-scale Cross Distillation for Object Detection in Aerial Images
Kun Wang, Zi Wang, Zhang Li*, Xichao Teng, Yang Li
[pdf]
[DOI]

Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo*
[pdf]
[DOI]

Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence
Yutong Chen, Yifan Zhan, Zhihang Zhong*, Wei Wang, Xiao Sun*, Yu Qiao, Yinqiang Zheng
[pdf]
[DOI]

Revisit Human-Scene Interaction via Space Occupancy
Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li*, Cewu Lu
[pdf]
[DOI]

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
Yue Han*, Junwei Zhu, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, Yong Liu
[pdf]
[DOI]

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model
Haisheng Fu*, Jie Liang, Zhenman Fang, Jingning Han, Feng Liang, Guohe Zhang
[pdf]
[DOI]

Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Pengyu Li*, biao wang, Tianchu Guo, Xian-Sheng Hua
[pdf]
[DOI]

Mitigating Background Shift in Class-Incremental Semantic Segmentation
Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, Jae-Pil Heo*
[pdf]
[DOI]

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
Xiuquan Hou, Meiqin Liu*, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan
[pdf]
[DOI]

BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation
Zekai Xu, Kang You, Qinghai Guo, Xiang Wang, Zhezhi He*
[pdf]
[DOI]

Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Siyuan Pan, Pengfei Wan, Shiji Song, Gao Huang*
[pdf]
[DOI]

Learning by Aligning 2D Skeleton Sequences and Multi-Modality Fusion
Quoc-Huy Tran*, Muhammad Ahmed, Murad Popattia, Muhammad Hassan Ahmed, Andrey Konin, Zeeshan Zia
[pdf]
[DOI]

Resolving Scale Ambiguity in Multi-view 3D Reconstruction using Dual-Pixel Sensors
Kohei Ashida*, Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita
[pdf]
[DOI]

Object-Oriented Anchoring and Modal Alignment in Multimodal Learning
Shibin Mei, Bingbing Ni*, Hang Wang, Chenglong Zhao, fengfa hu, Zhiming Pi, BiLian Ke
[pdf]
[DOI]

Towards Stable 3D Object Detection
Jiabao Wang, Qiang Meng, Guochao Liu, Liujiang Yan, Ke Wang, Ming-Ming Cheng, Qibin Hou*
[pdf]
[DOI]

FYI: Flip Your Images for Dataset Distillation
Byunggwan Son*, Youngmin Oh, Donghyeon Baek, Bumsub Ham*
[pdf]
[DOI]

On-the-fly Category Discovery for LiDAR Semantic Segmentation
Hyeonseong Kim, Sung-Hoon Yoon, Minseok Kim, Kuk-Jin Yoon*
[pdf]
[DOI]

Dual-Camera Smooth Zoom on Mobile Phones
Renlong Wu, Zhilu Zhang*, Yu Yang, Wangmeng Zuo
[pdf]
[DOI]

ProtoComp: Diverse Point Cloud Completion with Controllable Prototype
Xumin Yu, Yanbo Wang, Jie Zhou, Jiwen Lu*
[pdf]
[DOI]

CONDA: Condensed Deep Association Learning for Co-Salient Object Detection
Long Li, Nian Liu*, Dingwen Zhang, Zhongyu Li, Salman Khan, Rao Anwer, Hisham Cholakkal, Junwei Han*, Fahad Shahbaz Khan
[pdf]
[DOI]

Cascade Prompt Learning for Visual-Language Model Adaptation
Ge Wu, Xin Zhang, Zheng Li, Zhaowei Chen, Jiajun Liang, Jian Yang, Xiang Li*
[pdf]
[DOI]

PolyRoom: Room-aware Transformer for Floorplan Reconstruction
Yuzhou Liu, Lingjie Zhu, Xiaodong Ma, Hanqiao Ye, Xiang Gao, Xianwei Zheng, Shuhan Shen*
[pdf]
[DOI]

BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
Rizhao Cai*, Zirui Song, Dayan Guan*, Zhenhao Chen, Yaohang Li, Xing Luo, Chenyu Yi, Alex Kot
[pdf]
[DOI]

SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution
mingjun zheng, Long Sun, Jiangxin Dong, Jinshan Pan*
[pdf]
[DOI]

HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, ZhiWei Lin, Xinhao Wang, Yongtao Wang*, Yun Xing, Shengxiang Qi, Nan Dong, Ming-Hsuan Yang
[pdf]
[DOI]

Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Bowei Xing*, Xianghua Ying, Ruibin Wang, Ruohao Guo, Ji Shi, Wenzhen Yue
[pdf]
[DOI]

Customized Generation Reimagined: Fidelity and Editability Harmonized
Jian Jin, Yang Shen, Zhenyong Fu*, Jian Yang*
[pdf]
[DOI]

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
Kaishen Yuan, Zitong Yu*, Xin Liu*, Weicheng Xie, Huanjing Yue, Jingyu Yang
[pdf]
[DOI]

Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou, Tao Zhang*, Xiangtai Li*, Shunping Ji*, Shuicheng Yan
[pdf]
[DOI]

Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao*, Chaoyi Zhang, Hang Su, Hwanjun Song, Igor Shalyminov, Weidong Cai
[pdf]
[DOI]

Diffusion Models as Optimizers for Efficient Planning in Offline RL
Renming Huang, Yunqiang Pei, Guoqing Wang*, Yangming Zhang, Yang Yang, Peng Wang, Heng Tao Shen
[pdf]
[DOI]

Enhanced Sparsification via Stimulative Training
Shengji Tang, Weihao Lin, Hancheng Ye, Peng Ye, Chong Yu, Baopu Li, Tao Chen*
[pdf]
[DOI]

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
Haoqin Tu*, Chenhang Cui, Zijun Wang, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie*
[pdf]
[DOI]

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation
Jingyang Huo, Yikai Wang, Yanwei Fu*, Xuelin Qian, Chong Li, Yun Wang, Jianfeng Feng
[pdf]
[DOI]

Coarse-to-Fine Implicit Representation Learning for 3D Hand-Object Reconstruction from a Single RGB-D Image
Xingyu Liu, Pengfei Ren, Jingyu Wang*, Qi Qi, Haifeng Sun, Zirui Zhuang*, Jianxin Liao
[pdf]
[DOI]

Efficient Snapshot Spectral Imaging: Calibration-Free Parallel Structure with Aperture Diffraction Fusion
Tao Lv*, Lihao Hu, Shiqiao Li, Chenglong Huang, Xun Cao
[pdf]
[DOI]

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao*, Shuo Wang
[pdf]
[DOI]

PapMOT: Exploring Adversarial Patch Attack against Multiple Object Tracking
Jiahuan Long*, Tingsong Jiang*, Wen Yao*, Shuai Jia*, Weijia Zhang*, Weien Zhou*, Chao Ma*, Xiaoqian Chen*
[pdf]
[DOI]

HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models
Shen Zhang, Zhaowei CHEN, Zhenyu Zhao, Yuhao Chen, Yao Tang, Jiajun Liang*
[pdf]
[DOI]

On the Approximation Risk of Few-Shot Class-Incremental Learning
Xuan Wang, Zhong Ji*, Xiyao Liu, Yanwei Pang, Jungong Han
[pdf]
[DOI]

Syn-to-Real Domain Adaptation for Point Cloud Completion via Part-based Approach
Yunseo Yang, Jihun Kim, Kuk-Jin Yoon*
[pdf]
[DOI]

Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization
Jiajun Hu, Jian Zhang, Lei Qi*, Yinghuan Shi*, Yang Gao
[pdf]
[DOI]

SCOMatch: Alleviating Overtrusting in Open-set Semi-supervised Learning
Zerun Wang*, Liuyu Xiang, Lang Huang, Jiafeng Mao, Ling Xiao, Toshihiko Yamasaki
[pdf]
[DOI]

Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning
Meixuan Li, Tianyu Li, Guoqing Wang*, Peng Wang, Yang Yang, Jie Zou
[pdf]
[DOI]

MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation
Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang*, Wangmeng Zuo*
[pdf]
[DOI]

PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training
Suyi Chen, Hao Xu, Haipeng Li, Kunming Luo, Guanghui Liu, Chi-Wing Fu, Ping Tan, Shuaicheng Liu*
[pdf]
[DOI]

General Geometry-aware Weakly Supervised 3D Object Detection
Guowen Zhang*, Junsong Fan, Liyi Chen, Zhaoxiang Zhang, Zhen Lei, Lei Zhang
[pdf]
[DOI]

Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang*, Pan Zhang, Xiaoyi Dong*, Yuhang Zang, Jiaqi Wang*
[pdf]
[DOI]

Dolfin: Diffusion Layout Transformers without Autoencoder
Yilin Wang, Zeyuan Chen, Liangjun Zhong, Zheng Ding, Zhuowen Tu*
[pdf]
[DOI]

Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai*, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen*, Qifeng Chen*
[pdf]
[DOI]

StructLDM: Structured Latent Diffusion for 3D Human Generation
Tao Hu, Fangzhou Hong, Ziwei Liu*
[pdf]
[DOI]

Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation
Han Li*, Shaohui Li*, Shuangrui Ding, Wenrui Dai*, Maida Cao, Chenglin Li, Junni Zou, Hongkai Xiong
[pdf]
[DOI]

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models
Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo*
[pdf]
[DOI]

Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification
Yu Bai, Bo Zhang*, Zheng Zhang, Shuo Yan, Zibo Ma, Wu Liu, Xiuzhuang Zhou, Xiangyang Gong, Wendong Wang
[pdf]
[DOI]

Continuous Memory Representation for Anomaly Detection
Joo Chan Lee*, Taejune Kim, Eunbyung Park*, Simon S Woo*, Jong Hwan Ko*
[pdf]
[DOI]

InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser
Xing Cui, Zekun Li, Peipei Li*, Huaibo Huang, Xuannan Liu, Zhaofeng He
[pdf]
[DOI]

PACE: Pose Annotations in Cluttered Environments
Yang You*, Kai Xiong, Zhening Yang, Zhengxiang Huang, Junwei Zhou, Ruoxi Shi, Zhou Fang, Adam Harley, Leonidas Guibas, Cewu Lu*
[pdf]
[DOI]

CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring
Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon*
[pdf]
[DOI]

CountFormer: Multi-View Crowd Counting Transformer
Hong Mo*, Xiong Zhang*, Jianchao Tan, Cheng Yang, Qiong Gu, Bo Hang, Wenqi Ren
[pdf]
[DOI]

Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery
Haiyang Zheng, Nan Pu, Wenjing Li*, Nicu Sebe, Zhun Zhong*
[pdf]
[DOI]

Continuous SO(3) Equivariant Convolution for 3D Point Cloud Analysis
Jaein Kim, Hee Bin Yoo, Dong-Sig Han, Yeon-Ji Song, Byoung-Tak Zhang*
[pdf]
[DOI]

EA-VTR: Event-Aware Video-Text Retrieval
Zongyang Ma*, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Chunfeng Yuan, Bing Li, Yingmin Luo, Xu Li, Xiaojuan Qi, Ying Shan, Weiming Hu
[pdf]
[DOI]

Privacy-Preserving Adaptive Re-Identification without Image Transfer
Hamza Rami*, Jhony H. Giraldo, Nicolas Winckler, Stéphane Lathuilière
[pdf]
[DOI]

A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging
Miao Cao*, Lishun Wang, Huan Wang, Xin Yuan
[pdf]
[DOI]

DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks
Caixin Kang*, Yinpeng Dong, Zhengyi Wang, Shouwei Ruan, Yubo Chen, Hang Su*, Xingxing Wei*
[pdf]
[DOI]

Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation
Kihong Kim, Haneol Lee, Jihye Park, Seyeon Kim, Kwang Hee Lee, Seungryong Kim*, Jaejun Yoo*
[pdf]
[DOI]

Background Adaptation with Residual Modeling for Exemplar-Free Class-Incremental Semantic Segmentation
Anqi Zhang, Guangyu Gao*
[pdf]
[DOI]

Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation
Yeongtak Oh, Jonghyun Lee, Jooyoung Choi, Dahuin Jung, Uiwon Hwang*, Sungroh Yoon*
[pdf]
[DOI]

Learning to Unlearn for Robust Machine Unlearning
Mark He Huang*, Lin Geng Foo, Jun Liu*
[pdf]
[DOI]

Emergent Visual-Semantic Hierarchies in Image-Text Representations
Morris Alper*, Hadar Averbuch-Elor
[pdf]
[DOI]

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation
Zhenliang Ni, Xinghao Chen*, Yingjie Zhai, Yehui Tang, Yunhe Wang*
[pdf]
[DOI]

DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima*, Katrin Renz, Kashyap Chitta, Li Chen, Zhang Hanxue, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
[pdf]
[DOI]

Neural Spectral Decomposition for Dataset Distillation
Shaolei Yang, Shen Cheng, Mingbo Hong, Haoqiang Fan, Xing Wei, Shuaicheng Liu*
[pdf]
[DOI]

Beyond Viewpoint: Robust 3D Object Recognition under Arbitrary Views through Joint Multi-Part Representation
Linlong Fan, Ye Huang*, Yanqi Ge, Wen Li, Lixin Duan
[pdf]
[DOI]

Learning Non-Linear Invariants for Unsupervised Out-of-Distribution Detection
Lars Doorenbos*, Raphael Sznitman, Pablo Márquez Neila
[pdf]
[DOI]

Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection
Trinh Le Ba Khanh*, Huy-Hung Nguyen, Long Hoang Pham, Duong Nguyen-Ngoc Tran, Jae Wook Jeon*
[pdf]
[DOI]

Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Weidi Xie, Yan-Feng Wang*
[pdf]
[DOI]

Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution
Junxiong Lin*, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haoran Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang
[pdf]
[DOI]

Disentangled Clothed Avatar Generation from Text Descriptions
Jionghao Wang*, Yuan Liu, Zhiyang Dou, Zhengming Yu, Yongqing Liang, Cheng Lin, Rong Xie, Li Song*, Xin Li, Wenping Wang*
[pdf]
[DOI]

Real Appearance Modeling for More General Deepfake Detection
Jiahe Tian, Cai Yu, Xi Wang, Peng Chen, Zihao Xiao, Jiao Dai, Yesheng Chai*, Jizhong Han
[pdf]
[DOI]

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model
Matteo Bortolon*, Theodore Tsesmelis, Stuart James, Fabio Poiesi, Alessio Del Bue
[pdf]
[DOI]

Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning
Jia-Hao Xiao, Ming-Kun Xie, Heng-Bo Fan, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang*
[pdf]
[DOI]

V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception
Hao Xiang, Xin Xia, Zhaoliang Zheng, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, Li Jin, Mingyue Lei, Zhaoyang Ma, Zihang He, Haoxuan Ma, Yunshuang Yuan, Yingqian Zhao, Jiaqi Ma*
[pdf]
[DOI]

VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space
Guénolé Fiche*, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno
[pdf]
[DOI]

Attention Beats Linear for Fast Implicit Neural Representation Generation
Shuyi Zhang, Ke Liu, Jingjun Gu, Xiaoxu Cai, Zhihua Wang, Jiajun Bu, Haishuai Wang*
[pdf]
[DOI]

HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon, Seoung Wug Oh, Yang Zhou, Joon-Young Lee, Difan Liu, Haoran Cai, Baqiao Liu, Feng Liu, Youngjung Uh*
[pdf]
[DOI]

Deep Online Probability Aggregation Clustering
Yuxuan Yan, Na Lu*, Ruofan Yan
[pdf]
[DOI]

WRIM-Net: Wide-Ranging Information Mining Network for Visible-Infrared Person Re-Identification
Yonggan Wu, Ling-Chao Meng*, Yuan Zichao, Sixian Chan, Hong-Qiang Wang*
[pdf]
[DOI]

Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models
Chao Gong*, Kai Chen, Zhipeng Wei, Jingjing Chen*, Yu-Gang Jiang
[pdf]
[DOI]

Visual Text Generation in the Wild
Yuanzhi Zhu, Jiawei Liu, Feiyu Gao, Wenyu Liu*, Xinggang Wang, Peng Wang, Fei Huang, Cong Yao, Zhibo Yang*
[pdf]
[DOI]

Length-Aware Motion Synthesis via Latent Diffusion
Alessio Sampieri*, Alessio Palma, Indro Spinelli, Fabio Galasso
[pdf]
[DOI]

Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
Yunlong Zhang*, Honglin Li, Yuxuan Sun, Chenglu Zhu, Sunyi Zheng, Lin Yang*
[pdf]
[DOI]

An Optimal Control View of LoRA and Binary Controller Design for Vision Transformers
Chi Zhang*, Jingpu Cheng, Qianxiao Li
[pdf]
[DOI]

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun*, Rongrong Ji
[pdf]
[DOI]

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection
Jianwei Zhao*, Xin Li, Fan Yang, Qiang Zhai*, Ao Luo, Zhicheng Jiao, Hong Cheng
[pdf]
[DOI]

Improving image synthesis with diffusion-negative sampling
Alakh Desai*, Nuno Vasconcelos
[pdf]
[DOI]

AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos
Feichi Lu*, Zijian Dong*, Jie Song, Otmar Hilliges
[pdf]
[DOI]

FedVAD: Enhancing Federated Video Anomaly Detection with GPT-Driven Semantic Distillation
Fan Qi*, Ruijie Pan, Huaiwen Zhang, Changsheng Xu*
[pdf]
[DOI]

SignGen: End-to-End Sign Language Video Generation with Latent Diffusion
Fan Qi*, Yu Duan, Changsheng Xu, Huaiwen Zhang*
[pdf]
[DOI]

Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
Hongjing Niu*, Hanting Li, Bin Li, Feng Zhao*
[pdf]
[DOI]

Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems
Sojin Lee, Dogyun Park, Inho Kong, Hyunwoo J. Kim*
[pdf]
[DOI]

The Gaussian Discriminant Variational Autoencoder (GdVAE): A Self-Explainable Model with Counterfactual Explanations
Anselm Haselhoff*, Kevin Trelenberg, Fabian Küppers, Jonas Schneider
[pdf]
[DOI]

Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang*
[pdf]
[DOI]

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi*, Tobia Poppi*, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
[pdf]
[DOI]

TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek*, Torben Peters, Jan Dirk Wegner, Konrad Schindler
[pdf]
[DOI]

Camera Calibration using a Collimator System
Shunkun Liang, Banglei Guan*, Zhenbao Yu, Pengju Sun, Yang Shang
[pdf]
[DOI]

Label-free Neural Semantic Image Synthesis
Jiayi Wang*, Kevin A Laube, Yumeng Li, Jan Hendrik Metzen, Shin-I Cheng, Julio Borges, Anna Khoreva
[pdf]
[DOI]

Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
Yuwen Pan*, Rui Sun, Naisong Luo, Tianzhu Zhang, Yongdong Zhang
[pdf]
[DOI]

Multiscale Sliced Wasserstein Distances as Perceptual Color Difference Measures
Jiaqi He, Zhihua Wang, Leon Wang, Tsein-I Liu, Yuming Fang, Qilin Sun*, Kede Ma
[pdf]
[DOI]

DiscoMatch: Fast Discrete Optimisation for Geometrically Consistent 3D Shape Matching
Paul Roetzer*, Ahmed Abbas*, Dongliang Cao, Florian Bernard, Paul Swoboda
[pdf]
[DOI]

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Byeongjun Park, Hyojun Go, Jin-Young Kim, Sangmin Woo, Seokil Ham, Changick Kim*
[pdf]
[DOI]

FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN
Riccardo Santambrogio*, Marco Cannici, Matteo Matteucci
[pdf]
[DOI]

ConDense: Consistent 2D-3D Pre-training for Dense and Sparse Features from Multi-View Images
Xiaoshuai Zhang*, Zhicheng Wang, Howard Zhou, Soham Ghosh, Danushen L Gnanapragasam, Varun Jampani, Hao Su, Leonidas Guibas
[pdf]
[DOI]

MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
Anurag Das*, Xinting Hu, Li Jiang, Bernt Schiele
[pdf]
[DOI]

Event-Aided Time-To-Collision Estimation for Autonomous Driving
Jinghang Li, Bangyan Liao, Xiuyuan Lu, Peidong Liu, Shaojie Shen, Yi Zhou*
[pdf]
[DOI]

The Devil is in the Statistics: Mitigating and Exploiting Statistics Difference for Generalizable Semi-supervised Medical Image Segmentation
Muyang Qiu, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi*, Yang Gao
[pdf]
[DOI]

VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang, Guoqing Wang, Xiangxuan Ren, Bailan Feng, Chao Ma*
[pdf]
[DOI]

Adapt without Forgetting: Distill Proximity from Dual Teachers in Vision-Language Models
Mengyu Zheng*, Yehui Tang, Zhiwei Hao, Kai Han, Yunhe Wang, Chang Xu*
[pdf]
[DOI]

The Sky's the Limit: Relightable Outdoor Scenes via a Sky-pixel Constrained Illumination Prior and Outside-In Visibility
James A D Gardner*, Evgenii Kashin, Bernhard Egger, William Smith
[pdf]
[DOI]

DiffFAS: Face Anti-Spoofing via Generative Diffusion Models
Xinxu Ge, Xin Liu*, Zitong Yu*, Jingang Shi, Chun Qi, Jie Li, Heikki Kälviäinen
[pdf]
[DOI]

Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception
Congzhang Shao, Guiyang Luo*, Quan Yuan*, Yifu Chen, Yilin Liu, Gong Kexin, Jinglin Li
[pdf]
[DOI]

Learning-based Axial Video Motion Magnification
Kwon Byung-Ki, Oh Hyun-Bin, Kim Jun-Seong, Hyunwoo Ha, Tae-Hyun Oh*
[pdf]
[DOI]

Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
Yan Hao, Florent Forest*, Olga Fink
[pdf]
[DOI]

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion
Linlan Huang, Xusheng Cao, Haori Lu, Xialei Liu*
[pdf]
[DOI]

cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process
Yihang Chen, Tsai Hor Chan, Guosheng Yin, Yuming Jiang, Lequan Yu*
[pdf]
[DOI]

Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition
Haijun Xiong, Bin Feng*, Xinggang Wang, Wenyu Liu
[pdf]
[DOI]

Retargeting Visual Data with Deformation Fields
Tim Elsner*, Julia Berger, Tong Wu, Victor Czech, Lin Gao, Leif Kobbelt
[pdf]
[DOI]

Delving Deep into Engagement Prediction of Short Videos
Dasong Li, Wenjie Li, Baili Lu, Hongsheng Li, Sizhuo Ma, Gurunandan Krishnan, Jian Wang*
[pdf]
[DOI]

Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration
Emanuel Sanchez Aimar*, Nathaniel D Helgesen, Yonghao Xu, Marco Kuhlmann, Michael Felsberg
[pdf]
[DOI]

CLEO: Continual Learning of Evolving Ontologies
Shishir Muralidhara*, Saqib Bukhari, Georg Dr. Schneider, Didier Stricker, René Schuster
[pdf]
[DOI]

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu, Runkai Zheng, Jindong Wang*, Cheuk Hang Leung, Qi Wu*, Xing Xie
[pdf]
[DOI]

Wavelet Convolutions for Large Receptive Fields
Shahaf E Finder*, Roy Amoyal, Eran Treister, Oren Freifeld*
[pdf]
[DOI]

BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion
Bo-Kyeong Kim*, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi
[pdf]
[DOI]

Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation
Haoyu Ji, Bowen Chen, Xinglong Xu, Weihong Ren, Zhiyong Wang*, Honghai Liu
[pdf]
[DOI]

Leveraging scale- and orientation-covariant features for planar motion estimation
Marcus Valtonen Örnhag*, Alberto Jaenal
[pdf]
[DOI]

Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning
Zijun Long*, Lipeng Zhuang, George W Killick, Richard McCreadie, Gerardo Aragon-Camarasa, Paul Henderson
[pdf]
[DOI]

Adaptive Parametric Activation
Konstantinos P Alexandridis*, Jiankang Deng, Anh Nguyen, Shan Luo
[pdf]
[DOI]

Distractor-Free Novel View Synthesis via Exploiting Memorization Effect in Optimization
Yukun Wang*, Kunhong Li, Minglin Chen, Longguang Wang, Shunbo Zhou, Kaiwen Xue, Yulan Guo*
[pdf]
[DOI]

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors
Sungwon Hwang, Min-Jung Kim, Taewoong Kang, Jayeon Kang, Jaegul Choo*
[pdf]
[DOI]

HGL: Hierarchical Geometry Learning for Test-time Adaptation in 3D Point Cloud Segmentation
Tianpei Zou, Sanqing Qu, Zhijun Li, Alois C. Knoll, 何 良华, Guang Chen*, Changjun Jiang
[pdf]
[DOI]

SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting
Richard Shaw*, Michal Nazarczuk, Jifei Song, Arthur Moreau, Sibi Catley-Chandar, Helisa Dhamo, Eduardo Pérez Pellitero
[pdf]
[DOI]

Temporal-Mapping Photography for Event Cameras
Yuhan Bao, Lei Sun*, Yuqin Ma, Kaiwei Wang*
[pdf]
[DOI]

Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
Tuo Feng, Wenguan Wang, Ruijie Quan, Yi Yang*
[pdf]
[DOI]

LineFit: A Geometric Approach for Fitting Line Segments in Images
Marion Boyer, David Youssefi, Florent Lafarge*
[pdf]
[DOI]

Six-Point Method for Multi-Camera Systems with Reduced Solution Space
Banglei Guan, Ji Zhao*, Laurent Kneip
[pdf]
[DOI]

Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network
Sukwon Yun, Jie Peng, Alexandro E Trevino, Chanyoung Park, Tianlong Chen*
[pdf]
[DOI]

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Zilong Dong, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu*, Siyu Zhu*
[pdf]
[DOI]

AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition
Fadi Boutros*, Vitomir Struc, Naser Damer
[pdf]
[DOI]

HERGen: Elevating Radiology Report Generation with Longitudinal Data
Fuying Wang, Shenghui Du, Lequan Yu*
[pdf]
[DOI]

Labeled Data Selection for Category Discovery
Bingchen Zhao*, Nico Lang, Serge Belongie, Oisin Mac Aodha*
[pdf]
[DOI]

Dependency-aware Differentiable Neural Architecture Search
Buang Zhang*, Xinle Wu, Hao Miao, Bin Yang, Chenjuan Guo
[pdf]
[DOI]

WAS: Dataset and Methods for Artistic Text Segmentation
Xudong Xie, Yuzhe Li, Yang Liu, Zhifei Zhang, Zhaowen Wang, Wei Xiong, Xiang Bai*
[pdf]
[DOI]

CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
Wuyang Li, Xinyu Liu, Jiayi Ma, Yixuan Yuan*
[pdf]
[DOI]

GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer
Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon*
[pdf]
[DOI]

Norface: Improving Facial Expression Analysis by Identity Normalization
Hanwei Liu*, Rudong An, Zhimeng Zhang, Bowen Ma, Wei Zhang, Yan Song, Yujing Hu, Chen Wei, Yu Ding*
[pdf]
[DOI]

Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy
Hong Zhang, Yixuan Lyu, Qian Yu, Hanyang Liu, Huimin Ma, Yuan Ding, Yifan Yang*
[pdf]
[DOI]

SNeRV: Spectra-preserving Neural Representation for Video
Jina Kim*, Jihoo Lee*, Jewon Kang*
[pdf]
[DOI]

COMO: Compact Mapping and Odometry
Eric Dexheimer*, Andrew Davison
[pdf]
[DOI]

OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction
Yini Fang*, Jingling Yu, Haozheng Zhang, Ralf van der Lans, Bertram E Shi
[pdf]
[DOI]

SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
Jaeseong Lee*, Junha Hyung*, Sohyun Jeong, Jaegul Choo*
[pdf]
[DOI]

EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation
Chenhongyi Yang*, Anastasia Tkach, Shreyas Hampali, Linguang Zhang, Elliot J Crowley, Cem Keskin
[pdf]
[DOI]

An Information Theoretical View for Out-Of-Distribution Detection
Hu Jinjing, Wenrui Liu, Hong Chang*, Bingpeng Ma, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes
Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang, Jie Yang, Ying Shan, Yan-Pei Cao, Lin Gao*
[pdf]
[DOI]

Gated Temporal Diffusion for Stochastic Long-term Dense Anticipation
Olga Zatsarynna*, Emad Bahrami*, Yazan Abu Farha, Gianpiero Francesca, Jürgen Gall*
[pdf]
[DOI]

Gradient-Aware for Class-Imbalanced Semi-supervised Medical Image Segmentation
Wenbo Qi, Jiafei Wu*, S. C. Chan*
[pdf]
[DOI]

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova*, Anna Kukleva, Xudong Hong, Christian Rupprecht, Bernt Schiele, Hilde Kuehne
[pdf]
[DOI]

LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection
Sanmin Kim, Youngseok Kim, Sihwan Hwang, Hyeonjun Jeong, Dongsuk Kum*
[pdf]
[DOI]

Beyond the Data Imbalance: Employing the Heterogeneous Datasets for Vehicle Maneuver Prediction
Hyeongseok Jeon, Sanmin Kim, Abi Rahman Syamil, Junsoo Kim, Dongsuk Kum*
[pdf]
[DOI]

On Pretraining Data Diversity for Self-Supervised Learning
Hasan Abed Al Kader Hammoud*, Tuhin Das, Fabio Pizzati*, Philip Torr, Adel Bibi, Bernard Ghanem
[pdf]
[DOI]

Look Around and Learn: Self-Training Object Detection by Exploration
Gianluca Scarpellini*, Stefano Rosa*, Pietro Morerio, Lorenzo Natale, Alessio Del Bue
[pdf]
[DOI]

Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal*, Christos Sakaridis, Luc Van Gool
[pdf]
[DOI]

Motion and Structure from Event-based Normal Flow
Zhongyang Ren, Bangyan Liao, Delei Kong, Jinghang Li, Peidong Liu, Laurent Kneip, Guillermo Gallego, Yi Zhou*
[pdf]
[DOI]

ParCo: Part-Coordinating Text-to-Motion Synthesis
Qiran Zou, Shangyuan Yuan, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, Xiangyang Ji*
[pdf]
[DOI]

Learning to Complement and to Defer to Multiple Users
Zheng Zhang, Wenjie Ai, Kevin Wells, David M Rosewarne, Thanh-Toan Do, Gustavo Carneiro*
[pdf]
[DOI]

Tiny Models are the Computational Saver for Large Models
Qingyuan Wang*, Barry Cardiff, Antoine Frappé, Benoit Larras, Deepu John*
[pdf]
[DOI]

DragVideo: Interactive Drag-style Video Editing
Yufan Deng, Ruida Wang, Yuhao Zhang, Yu-Wing Tai*, Chi-Keung Tang*
[pdf]
[DOI]

Multi-Sentence Grounding for Long-term Instructional Video
Zeqian Li, Qirui Chen, Tengda Han, Ya Zhang, Yan-Feng Wang, Weidi Xie*
[pdf]
[DOI]

Do Generalised Classifiers really work on Human Drawn Sketches?
Hmrishav Bandyopadhyay*, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song
[pdf]
[DOI]

KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding
Zhihao Xu, Shengjie Gong, Jiapeng Tang, Lingyu Liang, Yining Huang, Haojie Li, Shuangping Huang*
[pdf]
[DOI]

Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He, Yiyu Zhuang, Yanwen Wang, Yao Yao, Siyu Zhu, Xiaoyu Li, Qi Zhang, Xun Cao, Hao Zhu*
[pdf]
[DOI]

MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou*
[pdf]
[DOI]

Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer
Yang Wu*, Kaihua Zhang, Jianjun Qian, Jin Xie*, Jian Yang
[pdf]
[DOI]

Enhanced Motion Forecasting with Visual Relation Reasoning
Sungjune Kim, Hadam Baek, Seunggwan Lee, Hyung-gun Chi, Hyerin Lim, Jinkyu Kim*, Sangpil Kim*
[pdf]
[DOI]

Rate-Distortion-Cognition Controllable Versatile Neural Image Compression
Jinming Liu*, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin
[pdf]
[DOI]

Temporal As a Plugin: Unsupervised Video Denoising with Pre-Trained Image Denoisers
Zixuan Fu*, Lanqing Guo, Chong Wang, Yufei Wang, Zhihao Li, Bihan Wen
[pdf]
[DOI]

LiDAR-based All-weather 3D Object Detection via Prompting and Distilling 4D Radar
Yujeong Chae, Hyeonseong Kim, Changgyoon Oh, Minseok Kim, Kuk-Jin Yoon*
[pdf]
[DOI]

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models
Xin Liu*, Yichen Zhu, Jindong Gu, Yunshi Lan, Chao Yang, Yu Qiao
[pdf]
[DOI]

Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models
Siao Tang, Xin Wang*, Hong Chen, Chaoyu Guan, Zewen Wu, Yansong Tang, Wenwu Zhu*
[pdf]
[DOI]

Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer
Eric Brachmann*, Jamie Wynn, Shuai Chen, Tommaso Cavallari, Aron Monszpart, Daniyar Turmukhambetov, Victor Adrian Prisacariu
[pdf]
[DOI]

Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
Ruicheng Wang*, Jianfeng Xiang, Jiaolong Yang, Xin Tong
[pdf]
[DOI]

Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation
Xinyu Yang*, Hossein Rahmani, Dame S Black, Bryan M Williams
[pdf]
[DOI]

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
Ming Tao*, Bingkun Bao*, Hao Tang, Yaowei Wang, Changsheng Xu
[pdf]
[DOI]

ST-LLM: Large Language Models Are Effective Temporal Learners
Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li*
[pdf]
[DOI]

Exact Diffusion Inversion via Bidirectional Integration Approximation
Guoqiang Zhang*, J.P. Lewis, W. Bastiaan Kleijn
[pdf]
[DOI]

Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak, Byeongju Woo, Sunghwan Kim, Dae-hwan Kim, Hoseong Kim*
[pdf]
[DOI]

EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head
Qianyun He, Xinya Ji, Yicheng Gong, Yuanxun Lu, Zhengyu Diao, Linjia Huang, Yao Yao, Siyu Zhu, Zhan Ma, Songcen Xu, Xiaofei Wu, Zixiao Zhang, Xun Cao, Hao Zhu*
[pdf]
[DOI]

Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang*, Dongwei Ren*, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma
[pdf]
[DOI]

Object-Centric Diffusion for Efficient Video Editing
Kumara Kahatapitiya*, Adil Karjauv, Davide Abati*, Fatih Porikli, Yuki M Asano, Amirhossein Habibian
[pdf]
[DOI]

Single-Mask Inpainting for Voxel-based Neural Radiance Fields
Jiafu Chen*, Tianyi Chu, Jiakai Sun, Wei Xing, Lei Zhao
[pdf]
[DOI]

McGrids: Monte Carlo-Driven Adaptive Grids for Iso-Surface Extraction
Daxuan Ren*, Hezi Shi, Jianmin Zheng, Jianfei Cai
[pdf]
[DOI]

Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval
Aneeshan Sain*, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song
[pdf]
[DOI]

Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
Yanting Yang, Minghao Chen*, Qibo Qiu, Jiahao Wu, Wenxiao Wang, Binbin Lin, Ziyu Guan, Xiaofei He
[pdf]
[DOI]

Diffusion for Natural Image Matting
Yihan Hu*, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei*, Humphrey Shi
[pdf]
[DOI]

Agglomerative Token Clustering
Joakim Bruslund Haurum*, Sergio Escalera, Graham W. Taylor*, Thomas B. Moeslund
[pdf]
[DOI]

CMD: A Cross Mechanism Domain Adaptation Dataset for 3D Object Detection
Jinhao Deng, Wei Ye, Hai Wu, Qiming Xia, Xun Huang, Xin Li, Jin Fang, Wei Li*, Chenglu Wen*, Cheng Wang
[pdf]
[DOI]

Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo, Jingwen Chen, Yehao Li, Yingwei Pan*, Jianlin Feng, Hongyang Chao, Ting Yao
[pdf]
[DOI]

ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition
Tianhao Wu*, Chuanxia Zheng, Qianyi Wu, Tat-Jen Cham
[pdf]
[DOI]

NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition
Chenyu Liu, Jia Pan, Jinshui Hu, Baocai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du*, Qingfeng Liu
[pdf]
[DOI]

GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen*, Cian Eastwood, Fabian Mentzer
[pdf]
[DOI]

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon*, Yonatan Bitton*, Yonatan Shafir, Roopal Garg, Xi Chen, Dani Lischinski, Daniel Cohen-Or, Idan Szpektor
[pdf]
[DOI]

Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density
Peiyu Yang*, Naveed Akhtar, Mubarak Shah, Ajmal Mian
[pdf]
[DOI]

Multi-Modal Video Dialog State Tracking in the Wild
Adnen Abdessaied*, Lei Shi, Andreas Bulling
[pdf]
[DOI]

Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Daniel Geng*, Inbum Park, Andrew Owens
[pdf]
[DOI]

To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
Yimeng Zhang*, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, Sijia Liu
[pdf]
[DOI]

Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions
Jin Gao, Lei Gan, Yuankai Li, Yixin Ye, Dequan Wang*
[pdf]
[DOI]

StereoGlue: Joint Feature Matching and Robust Estimation
Daniel Barath*, Dmytro Mishkin, Luca Cavalli, Paul-Edouard Sarlin, Petr Hruby, Marc Pollefeys
[pdf]
[DOI]

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao, Xiaojun Jia*, Xuhong Ren, Ivor Tsang, Qing Guo*
[pdf]
[DOI]

Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction
Zihao Liu, Xiaoyu Zhang, Guangwei Liu, Ji Zhao*, Ningyi Xu*
[pdf]
[DOI]

Robust Zero-Shot Crowd Counting and Localization with Adaptive Resolution SAM
Jia Wan*, Qiangqiang Wu, Wei Lin, Antoni Chan
[pdf]
[DOI]

AWOL: Analysis WithOut synthesis using Language
Silvia Zuffi*, Michael J. Black
[pdf]
[DOI]

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei Zhang*, Wenqiang Zhang*
[pdf]
[DOI]

M3DBench: Towards Omni 3D Assistant with Interleaved Multi-modal Instructions
Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Zhuoyuan Li, Gang Yu, Tao Chen*
[pdf]
[DOI]

MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
Casper van Engelenburg*, Fatemeh Mostafavi, Emanuel Kuhn, Yuntae Jeon, Michael Franzen, Matthias Standfest, Jan van Gemert, Seyran Khademi
[pdf]
[DOI]

End-to-End Rate-Distortion Optimized 3D Gaussian Representation
Henan Wang*, Hanxin Zhu, Tianyu He, Runsen Feng, Jiajun Deng, Jiang Bian, Zhibo Chen
[pdf]
[DOI]

Temporal Residual Jacobians for Rig-free Motion Transfer
Sanjeev Muralikrishnan*, Niladri Shekhar Dutt, Siddhartha Chaudhuri, Noam Aigerman, Vladimir Kim, Matthew Fisher, Niloy Mitra
[pdf]
[DOI]

LetsMap: Unsupervised Representation Learning for Label-Efficient Semantic BEV Mapping
Nikhil Gosala*, Kürsat Petek, B Ravi Kiran, Senthil Yogamani, Paulo L. J. Drews-Jr, Wolfram Burgard, Abhinav Valada
[pdf]
[DOI]

Deblurring 3D Gaussian Splatting
Byeonghyeon Lee*, Howoong Lee, Xiangyu Sun, Usman Ali, Eunbyung Park*
[pdf]
[DOI]

Taming Lookup Tables for Efficient Image Retouching
Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang*
[pdf]
[DOI]

DualDn: Dual-domain Denoising via Differentiable ISP
Ruikang Li, Yujin Wang*, Shiqi Chen, Fan Zhang, Jinwei Gu, Tianfan Xue
[pdf]
[DOI]

Quantization-Friendly Winograd Transformations for Convolutional Neural Networks
Vladimir Protsenko*, Vladimir Kryzhanovskiy, Alexander Filippov
[pdf]
[DOI]

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting
Junhao Zhuang, Yanhong Zeng, Wenran Liu, Chun Yuan*, Kai Chen*
[pdf]
[DOI]

Self-supervised Shape Completion via Involution and Implicit Correspondences
Mengya Liu*, Ajad Chhatkuli, Janis Postels, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition
Maan Qraitem*, Kate Saenko, Bryan A. Plummer
[pdf]
[DOI]

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector
Yuqian Fu*, Yu Wang, Yixuan Pan, Xingyu Qiu, Lian Huai, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc Van Gool, Xingqun Jiang
[pdf]
[DOI]

NICP: Neural ICP for 3D Human Registration at Scale
Riccardo Marin*, Enric Corona, Gerard Pons-Moll
[pdf]
[DOI]

PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
ZiDong Wang*, Zeyu Lu*, Di Huang*, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai*
[pdf]
[DOI]

FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation
Xinzhi Mu*, Li Chen, Bohan Chen, Shuyang Gu, Jianmin Bao, Dong Chen, Ji Li, Yuhui Yuan
[pdf]
[DOI]

Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models
Kent Fujiwara*, Mikihiro Tanaka, Qing Yu
[pdf]
[DOI]

StableDrag: Stable Dragging for Point-based Image Editing
Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang*
[pdf]
[DOI]

Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context
Shashank Agnihotri*, Julia Grabinski, Margret Keuper
[pdf]
[DOI]

Dynamic Data Selection for Efficient SSL via Coarse-to-Fine Refinement
Aditay Tripathi*, Pradeep Shenoy, Anirban Chakraborty
[pdf]
[DOI]

Neural Surface Detection for Unsigned Distance Fields
Federico Stella*, Nicolas Talabot, Hieu Le, Pascal Fua
[pdf]
[DOI]

One-Shot Diffusion Mimicker for Handwritten Text Generation
Gang Dai, Yifan Zhang, Quhui Ke, Qiangya Guo, Shuangping Huang*
[pdf]
[DOI]

Event-Based Motion Magnification
Yutian Chen, Shi Guo*, Yu Fangzheng, Feng Zhang, Jinwei Gu, Tianfan Xue
[pdf]
[DOI]

Improving Neural Surface Reconstruction with Feature Priors from Multi-View Images
Xinlin Ren*, Chenjie Cao, Yanwei Fu*, Xiangyang Xue
[pdf]
[DOI]

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification
Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai*, Ke Li, Lihua Zhang*
[pdf]
[DOI]

Kernel Diffusion: An Alternate Approach to Blind Deconvolution
Yash Sanghvi*, Yiheng Chi, Stanley Chan
[pdf]
[DOI]

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty
Tim Broedermann*, David Brüggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc Van Gool
[pdf]
[DOI]

Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
Sanjoy Kundu, Shubham Trehan, Sathyanarayanan N Aakur*
[pdf]
[DOI]

Bidirectional Progressive Transformer for Interaction Intention Anticipation
Zichen Zhang*, Hongchen Luo, Wei Zhai*, Yu Kang, Yang Cao
[pdf]
[DOI]

Reinforcement Learning Meets Visual Odometry
Nico Messikommer*, Giovanni Cioffi, Mathias Gehrig, Davide Scaramuzza
[pdf]
[DOI]

Bucketed Ranking-based Losses for Efficient Training of Object Detectors
Feyza Yavuz*, Baris Can Cam, Adnan Harun Dogan, Kemal Oksuz, Emre Akbas, Sinan Kalkan
[pdf]
[DOI]

Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer*, Yury Belousov, Slava Voloshynovskiy
[pdf]
[DOI]

RSL-BA: Rolling Shutter Line Bundle Adjustment
Yongcong Zhang, Bangyan Liao, Yifei Xue, Lu Chen, Peidong Liu, Yizhen Lao*
[pdf]
[DOI]

DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images
Zaid Tasneem*, Akshat Dave, Abhishek Singh, Kushagra Tiwary, Praneeth Vepakomma, Ashok Veeraraghavan, Ramesh Raskar
[pdf]
[DOI]

DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
Haibo Yang, Yang Chen, Yingwei Pan*, Ting Yao, Zhineng Chen, Zuxuan Wu, Yu-Gang Jiang, Tao Mei
[pdf]
[DOI]

Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models
Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu*
[pdf]
[DOI]

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
Yash Bhalgat*, Iro Laina, Joao F Henriques, Andrew Zisserman, Andrea Vedaldi
[pdf]
[DOI]

ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
Shaozhe Hao*, Kai Han*, Zhengyao Lv, Shihao Zhao, Kwan-Yee K. Wong*
[pdf]
[DOI]

PairingNet: A Learning-based Pair-searching and -matching Network for Image Fragments
Rixin Zhou*, Ding Xia, Yi Zhang, Honglin Pang, Xi Yang, Chuntao Li
[pdf]
[DOI]

Skeleton-based Group Activity Recognition via Spatial-Temporal Panoramic Graph
Zhengcen Li, Xinle Chang, Yueran Li, Jingyong Su*
[pdf]
[DOI]

Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong*, Eleni Chatzi*, Olga Fink*
[pdf]
[DOI]

ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories
Chen-Yi Lu*, Shubham Agarwal, Md Mehrab Tanjim, Kanak Mahadik, Anup Rao, Subrata Mitra, Shiv K Saini, Saurabh Bagchi, Somali Chaterji
[pdf]
[DOI]

AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
Pavel Suma*, Giorgos Kordopatis-Zilos, Ahmet Iscen, Giorgos Tolias
[pdf]
[DOI]

TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
Jeongho Kim*, Min-Jung Kim*, Junsoo Lee, Jaegul Choo*
[pdf]
[DOI]

3D Hand Sequence Recovery from Real Blurry Images and Event Stream
JoonKyu Park, Gyeongsik Moon, Weipeng Xu, Evan Kaseman, Takaaki Shiratori, Kyoung Mu Lee*
[pdf]
[DOI]

GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation
Bangyan Liao, Zhenjun Zhao, Lu Chen, Haoang Li, Daniel Cremers, Peidong Liu*
[pdf]
[DOI]

Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection
Jian Shi*, Pengyi Zhang, Ni Zhang, Hakim Ghazzai, Peter Wonka
[pdf]
[DOI]

StyleCity: Large-Scale 3D Urban Scenes Stylization
Yingshu Chen, Huajian Huang*, Tuan-Anh Vu, Ka Chun Shum, Sai-Kit Yeung
[pdf]
[DOI]

ViG-Bias: Visually Grounded Bias Discovery and Mitigation
Badr-Eddine Marani*, Mohamed Hanini, Nihitha Malayarukil, Stergios Christodoulidis, Maria Vakalopoulou, Enzo Ferrante
[pdf]
[DOI]

DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior
Xinqi Lin*, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Yu Qiao, Wanli Ouyang, Chao Dong*
[pdf]
[DOI]

Assessing Sample Quality via the Latent Space of Generative Models
Jingyi Xu*, Hieu Le, Dimitris Samaras
[pdf]
[DOI]

Relightable Neural Actor with Intrinsic Decomposition and Pose Control
Diogo Carbonera Luvizon*, Vladislav Golyanik, Adam Kortylewski, Marc Habermann, Christian Theobalt
[pdf]
[DOI]

Sur^2f: A Hybrid Representation for High-Quality and Efficient Surface Reconstruction from Multi-view Images
Zhangjin Huang*, Zhihao Liang, Kui Jia*
[pdf]
[DOI]

HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes
Zhuopeng Li*, Yilin Zhang, Chenming Wu, Jianke Zhu*, Liangjun Zhang
[pdf]
[DOI]

Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation
Yangzheng Wu*, Michael Alan Greenspan
[pdf]
[DOI]

Consistent 3D Line Mapping
Xulong Bai, Hainan Cui*, Shuhan Shen*
[pdf]
[DOI]

Distributed Active Client Selection With Noisy Clients Using Model Association Scores
Kwang In Kim*
[pdf]
[DOI]

PixOOD: Pixel-Level Out-of-Distribution Detection
Tomas Vojir*, Jan Sochman, Jiri Matas
[pdf]
[DOI]

GarmentCodeData: A Dataset of 3D Made-to-Measure Garments With Sewing Patterns
Maria Korosteleva*, Timur Levent Kesdogan, Fabian Kemper, Stephan Wenninger, Jasmin Koller, Yuhan Zhang, Mario Botsch, Olga Sorkine-Hornung
[pdf]
[DOI]

Towards a Density Preserving Objective Function for Learning on Point Sets
Haritha Jayasinghe*, Ioannis Brilakis
[pdf]
[DOI]

AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Yuheng Li, Tianyu Luan, Yizhou Wu, Shaoyan Pan, Yenho Chen, Xiaofeng Yang*
[pdf]
[DOI]

VF-NeRF: Viewshed Fields for Rigid NeRF Registration
Leo Segre*, Shai Avidan
[pdf]
[DOI]

Task-Driven Uncertainty Quantification in Inverse Problems via Conformal Prediction
Jeffrey Wen*, Rizwan Ahmad, Phillip Schniter
[pdf]
[DOI]

Trainable Highly-expressive Activation Functions
Irit Chelly*, Shahaf E. Finder, Shira Ifergane, Oren Freifeld
[pdf]
[DOI]

Region-Aware Sequence-to-Sequence Learning for Hyperspectral Denoising
JiaHua Xiao, Yang Liu, Xing Wei*
[pdf]
[DOI]

Self-Supervised Representation Learning for Adversarial Attack Detection
Yi Li*, Plamen Angelov, Neeraj Suri
[pdf]
[DOI]

Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay*, Matthew A Gwilliam*, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Jun Ohya, Abhinav Shrivastava
[pdf]
[DOI]

Clean & Compact: Efficient Data-Free Backdoor Defense with Model Compactness
Huy Phan*, Jinqi Xiao, Yang Sui, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
[pdf]
[DOI]

DOCCI: Descriptions of Connected and Contrasting Images
Yasumasa Onoe*, Sunayana Rane, Zachary E Berger, Yonatan Bitton, Jaemin Cho, Roopal Garg, Alexander Ku, Zarana Parekh, Jordi Pont-Tuset, Garrett Tanzer, Su Wang, Jason M Baldridge
[pdf]
[DOI]

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks
Ziming Wang, Ziling Wang, Huaning Li, Lang Qin, Runhao Jiang, De Ma*, Huajin Tang*
[pdf]
[DOI]

AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild
Junho Park, Kyeongbo Kong, Suk-Ju Kang*
[pdf]
[DOI]

Dataset Quantization with Active Learning based Adaptive Sampling
Zhenghao Zhao*, Yuzhang Shang, Junyi Wu, Yan Yan
[pdf]
[DOI]

LogoSticker: Inserting Logos into Diffusion Models for Customized Generation
Mingkang Zhu, Xi Chen, Zhongdao Wang, Hengshuang Zhao*, Jiaya Jia*
[pdf]
[DOI]

LEROjD: Lidar Extended Radar-Only Object Detection
Patrick Palmer*, Martin Krüger, Stefan Schütte, Richard Altendorfer, Ganesh Adam, Torsten Bertram
[pdf]
[DOI]

ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation
Jack Lu*, Ryan Teehan*, Mengye Ren*
[pdf]
[DOI]

Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching
Junpeng Jing*, Ye Mao, Krystian Mikolajczyk*
[pdf]
[DOI]

Probabilistic Image-Driven Traffic Modeling via Remote Sensing
Scott Workman*, Armin Hadzic
[pdf]
[DOI]

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination
Xi Chen*, Sida Peng, Dongchen Yang, Yuan Liu, Bowen Pan, Chengfei Lyu, Xiaowei Zhou*
[pdf]
[DOI]

VideoStudio: Generating Consistent-Content and Multi-Scene Videos
Fuchen Long, Zhaofan Qiu*, Ting Yao, Tao Mei
[pdf]
[DOI]

Semantic Residual Prompts for Continual Learning
Martin Menabue*, Emanuele Frascaroli, Matteo Boschini, Enver Sangineto, Lorenzo Bonicelli, Angelo Porrello*, Simone Calderara
[pdf]
[DOI]

TransCAD: A Hierarchical Transformer for CAD Sequence Inference from Point Clouds
Elona Dupont*, Kseniya Cherenkova, Dimitrios Mallis, Gleb A Gusev, Anis Kacem, Djamila Aouada
[pdf]
[DOI]

ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Siming Yan*, Min Bai, Weifeng Chen, Xiong Zhou, Qixing Huang, Li Erran Li
[pdf]
[DOI]

Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection
Alireza Ganjdanesh*, Yan Kang, Yuchen Liu, Richard Zhang, Zhe Lin, Heng Huang
[pdf]
[DOI]

Occupancy as Set of Points
Yiang Shi, Tianheng Cheng, Qian Zhang, Wenyu Liu, Xinggang Wang*
[pdf]
[DOI]

UAV First-Person Viewers Are Radiance Field Learners
Liqi Yan*, Qifan Wang, Junhan Zhao, Qiang Guan, Zheng Tang, Jianhui Zhang, Dongfang Liu*
[pdf]
[DOI]

Rethinking Few-shot Class-incremental Learning: Learning from Yourself
Yu-Ming Tang, Yi-Xing Peng, Jingke Meng*, Wei-Shi Zheng
[pdf]
[DOI]

ProSub: Probabilistic Open-Set Semi-Supervised Learning with Subspace-Based Out-of-Distribution Detection
Erik Wallin*, Lennart Svensson, Fredrik Kahl, Lars Hammarstrand
[pdf]
[DOI]

A Fair Ranking and New Model for Panoptic Scene Graph Generation
Julian Lorenz*, Alexander Pest, Daniel Kienzle, Katja Ludwig, Rainer Lienhart
[pdf]
[DOI]

Pick-a-back: Selective Device-to-Device Knowledge Transfer in Federated Continual Learning
HyungJune Lee*, JinYi Yoon
[pdf]
[DOI]

Compensation Sampling for Improved Convergence in Diffusion Models
Hui Lu*, Albert Ali Salah, Ronald Poppe
[pdf]
[DOI]

Situated Instruction Following
So Yeon Min*, Xavier Puig, Devendra Singh Chaplot, Tsung-Yen Yang, Priyam Parashar, Akshara Rai, Ruslan Salakhutdinov, Yonatan Bisk, Roozbeh Mottaghi
[pdf]
[DOI]

Holodepth: Programmable Depth-Varying Projection via Computer-Generated Holography
Dorian Chan*, Matthew O'Toole, Sizhuo Ma, Jian Wang*
[pdf]
[DOI]

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model
Armen Avetisyan*, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Luke Holland, Duncan Frost, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas
[pdf]
[DOI]

GalLop: Learning global and local prompts for vision-language models
Marc Lafon*, Elias Ramzi*, Clément Rambour, Nicolas Audebert, Nicolas Thome
[pdf]
[DOI]

Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor
Andrea Conti*, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia
[pdf]
[DOI]

Lossy Image Compression with Foundation Diffusion Models
Lucas Relic*, Roberto Azevedo, Markus Gross, Christopher Schroers*
[pdf]
[DOI]

CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczańska*, Oriane Siméoni, Michaël Ramamonjisoa, Andrei Bursuc, Tomasz Trzciński, Patrick Pérez
[pdf]
[DOI]

FMBoost: Boosting Latent Diffusion with Flow Matching
Johannes S Fischer*, Ming Gui, Pingchuan Ma, Nick Stracke, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer
[pdf]
[DOI]

COMPOSE: Comprehensive Portrait Shadow Editing
Andrew Z Hou*, Zhixin Shu, Xuaner Zhang, He Zhang, Yannick Hold-Geoffroy, Jae Shin Yoon, Xiaoming Liu
[pdf]
[DOI]

LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration
Siqi Wang*, Bryan Plummer
[pdf]
[DOI]

Diffusion Models as Data Mining Tools
Ioannis Siglidis*, Aleksander Holynski, Alexei A. Efros, Mathieu Aubry, Shiry Ginosar
[pdf]
[DOI]

Graph Neural Network Causal Explanation via Neural Causal Models
Arman Behnam*, Binghui Wang
[pdf]
[DOI]

Unsupervised, Online and On-The-Fly Anomaly Detection For Non-Stationary Image Distributions
Declan GD McIntosh*, Alexandra Branzan Albu
[pdf]
[DOI]

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang*
[pdf]
[DOI]

GAReT: Cross-view Video Geolocalization with Adapters and Auto-Regressive Transformers
Manu S Pillai*, Mamshad Nayeem Rizve, Mubarak Shah
[pdf]
[DOI]

SAMFusion: Sensor-Adaptive Multimodal Fusion for 3D Object Detection in Adverse Weather
Edoardo Palladin*, Roland Dietze*, Praveen Narayanan, Mario Bijelic, Felix Heide
[pdf]
[DOI]

Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs
Aayam Shrestha, Pan Liu*, German Ros, Kai Yuan*, Alan Fern
[pdf]
[DOI]

CoTracker: It is Better to Track Together
Nikita Karaev*, Ignacio Rocco, Ben Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
[pdf]
[DOI]

SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
Ziyi Lin, Dongyang Liu, Renrui Zhang, Peng Gao*, Longtian Qiu, Han Xiao, Han Qiu, Wenqi Shao, Keqin Chen, Jiaming Han, Siyuan Huang, Yichi Zhang, Xuming He, Yu Qiao*, Hongsheng Li*
[pdf]
[DOI]

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
Yuxuan Sun*, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin*, Lin Yang*
[pdf]
[DOI]

Improving Adversarial Transferability via Model Alignment
Avery Ma*, Amir-massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
[pdf]
[DOI]

RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios
Wenhao Ding*, Yulong Cao, Ding Zhao, Chaowei Xiao, Marco Pavone
[pdf]
[DOI]

ADen: Adaptive Density Representations for Sparse-view Camera Pose Estimation
Hao Tang, Weiyao Wang, Pierre Gleize, Matt Feiszli*
[pdf]
[DOI]

Embodied Understanding of Driving Scenarios
Yunsong Zhou*, Linyan Huang, Qingwen Bu, Jia Zeng, Tianyu Li, Hang Qiu, Hongzi Zhu, Minyi Guo, Yu Qiao, Hongyang Li
[pdf]
[DOI]

Learning to Drive via Asymmetric Self-Play
Chris Zhang*, Sourav Biswas, Kelvin Wong, Kion Fallah, Lunjun Zhang, Dian Chen, Sergio Casas, Raquel Urtasun
[pdf]
[DOI]

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao*, Lei Zhu, Joan Lasenby*
[pdf]
[DOI]

ViLA: Efficient Video-Language Alignment for Video Question Answering
Xijun Wang*, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming C Lin, Shan Yang
[pdf]
[DOI]

Factorizing Text-to-Video Generation by Explicit Image Conditioning
Rohit Girdhar*, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Mian Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra
[pdf]
[DOI]

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
Yang Zhao*, Zhisheng Xiao*, Yanwu Xu, Haolin Jia, Tingbo Hou
[pdf]
[DOI]

Open-Set Biometrics: Beyond Good Closed-Set Models
Yiyang Su, Minchul Kim, Feng Liu, Anil Jain, Xiaoming Liu*
[pdf]
[DOI]

UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Siyuan Cheng*, Guangyu Shen, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang
[pdf]
[DOI]

Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution
Fengyuan Liu, Haochen Luo, Yiming Li, Philip Torr, Jindong Gu*
[pdf]
[DOI]

Osmosis: RGBD Diffusion Prior for Underwater Image Restoration
Opher Bar Nathan*, Deborah Levy, Tali Treibitz, Dan Rosenbaum
[pdf]
[DOI]

Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization
Feixiang Zhou, Bryan Williams, Hossein Rahmani*
[pdf]
[DOI]

Computing the Lipschitz constant needed for fast scene recovery from CASSI measurements
Niels Chr Overgaard*, Anders Holst
[pdf]
[DOI]

DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
Yu Chi*, Fangneng Zhan, Sibo Wu, Christian Theobalt, Adam Kortylewski
[pdf]
[DOI]

Flowed Time of Flight Radiance Fields
Mikhail Okunev*, Marc Mapeke, Benjamin Attal, Christian Richardt, Matthew O'Toole, James Tompkin
[pdf]
[DOI]

3D-GOI: 3D GAN Omni-Inversion for Multifaceted and Multi-object Editing
Haoran Li, Long Ma, Haolin Shi, Yanbin Hao, Yong Liao*, Lechao Cheng, Peng Yuan Zhou*
[pdf]
[DOI]

Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel*, Shaojie Bai, Te-Li Wang, Jason Saragih, Shih-En Wei
[pdf]
[DOI]

CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings
Cristina Mata*, Kanchana N Ranasinghe, Michael S Ryoo
[pdf]
[DOI]

HiFi-Score: Fine-grained Image Description Evaluation with Hierarchical Parsing Graphs
Ziwei Yao, Ruiping Wang*, Xilin Chen
[pdf]
[DOI]

Image-to-Lidar Relational Distillation for Autonomous Driving Data
Anas Mahmoud*, Ali Harakeh, Steven Waslander
[pdf]
[DOI]

Thinking Outside the BBox: Unconstrained Generative Object Compositing
Gemma Canet Tarrés*, Zhe Lin, Zhifei Zhang, Jianming Zhang, Yizhi Song, Dan Ruta, Andrew Gilbert, John Collomosse, Soo Ye Kim
[pdf]
[DOI]

Large-scale Reinforcement Learning for Diffusion Models
Yinan Zhang*, Eric Tzeng, Yilun Du, Dmitry Kislyuk*
[pdf]
[DOI]

CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion
Jiarui Sun*, Girish Chowdhary*
[pdf]
[DOI]

FedHARM: Harmonizing Model Architectural Diversity in Federated Learning
Anestis Kastellos*, Athanasios Psaltis, Charalampos Z Patrikakis, Petros Daras
[pdf]
[DOI]

EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS
Sharath Girish*, Kamal Gupta, Abhinav Shrivastava
[pdf]
[DOI]

Global Counterfactual Directions
Bartłomiej Sobieski*, Przemyslaw Biecek*
[pdf]
[DOI]

TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving
Cheng Zhao*, Su Sun, Ruoyu Wang, Yuliang Guo, Jun-Jun Wan, Zhou Huang, Xinyu Huang, Yingjie Victor Chen, Liu Ren
[pdf]
[DOI]

RT-Pose: A 4D Radar-Tensor based 3D Human Pose Estimation and Localization Benchmark
Yuan-Hao Ho, Jen-Hao Cheng, Sheng Yao Kuan, Zhongyu Jiang, Wenhao Chai, Hsiang-Wei Huang, Chih-Lung Lin, Jenq-Neng Hwang*
[pdf]
[DOI]

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models
Ruoxi Chen, Haibo Jin, Yixin Liu, Jinyin Chen*, Haohan Wang, Lichao Sun
[pdf]
[DOI]

RICA^2: Rubric-Informed, Calibrated Assessment of Actions
Abrar Majeedi, Viswanatha Reddy Gajjala, Satya Sai Srinath Namburi GNVV, Yin Li*
[pdf]
[DOI]

Region-centric Image-Language Pretraining for Open-Vocabulary Detection
Dahun Kim*, Anelia Angelova, Weicheng Kuo
[pdf]
[DOI]

Commonly Interesting Images
Fitim Abdullahu*, Helmut Grabner*
[pdf]
[DOI]

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi*, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
[pdf]
[DOI]

CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching
Samia Shafique*, Shu Kong, Charless Fowlkes
[pdf]
[DOI]

Caltech Aerial RGB-Thermal Dataset in the Wild
Connor Lee*, Matthew Anderson, Nikhil Ranganathan, Xingxing Zuo, Kevin T Do, Georgia Gkioxari, Soon-Jo Chung
[pdf]
[DOI]

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin J Biggs*, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
[pdf]
[DOI]

Volumetric Rendering with Baked Quadrature Fields
Gopal Sharma*, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi
[pdf]
[DOI]

CityGuessr: City-Level Video Geo-Localization on a Global Scale
Parth Parag Kulkarni*, Gaurav Kumar Nayak, Mubarak Shah
[pdf]
[DOI]

Pseudo-Labelling Should Be Aware of Disguising Channel Activations
Changrui Chen, Kurt Debattista, Jungong Han*
[pdf]
[DOI]

Bayesian Detector Combination for Object Detection with Crowdsourced Annotations
Zhi Qin Tan*, Olga Isupova, Gustavo Carneiro, Xiatian Zhu, Yunpeng Li
[pdf]
[DOI]

Revising Densification in Gaussian Splatting
Samuel Rota Bulò*, Lorenzo Porzi, Peter Kontschieder
[pdf]
[DOI]

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing
Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong, Chang D. Yoo*
[pdf]
[DOI]

Smoothness, Synthesis, and Sampling: Re-thinking Unsupervised Multi-View Stereo with DIV Loss
Alex Rich*, Noah Stier, Pradeep Sen, Tobias Hollerer
[pdf]
[DOI]

Text Motion Translator: A Bi-Directional Model for Enhanced 3D Human Motion Generation from Open-Vocabulary Descriptions
Yijun Qian*, Jack Urbanek, Alexander Hauptmann, Jungdam Won
[pdf]
[DOI]

UL-VIO: Ultra-lightweight Visual-Inertial Odometry with Noise Robust Test-time Adaptation
Jinho Park*, Se Young Chun, Mingoo Seok
[pdf]
[DOI]

PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis
Jason J. Yu*, Tristan Aumentado-Armstrong, Fereshteh Forghani, Konstantinos G. Derpanis, Marcus A. Brubaker
[pdf]
[DOI]

R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding
Qirui Wu*, Sonia Raychaudhuri, Daniel Ritchie, Manolis Savva, Angel X Chang
[pdf]
[DOI]

A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn*, Shai Avidan
[pdf]
[DOI]

Depth-guided NeRF Training via Earth Mover’s Distance
Anita Rau*, Josiah Aklilu, Floyd C Holsinger, Serena Yeung-Levy
[pdf]
[DOI]

INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding
Ji Ha Jang, Hoigi Seo, Se Young Chun*
[pdf]
[DOI]

DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks
Sarah Jabbour*, Gregory Kondas, Ella Kazerooni, Michael Sjoding, David Fouhey, Jenna Wiens
[pdf]
[DOI]

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury*, Sayan Nag, Subhrajyoti Dasgupta, Jun Chen, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
[pdf]
[DOI]

Diagnosing and Re-learning for Balanced Multimodal Learning
Yake Wei, Siwei Li, Ruoxuan Feng, Di Hu*
[pdf]
[DOI]

Contribution-based Low-Rank Adaptation with Pre-training Model for Real Image Restoration
Dongwon Park, Hayeon Kim, Se Young Chun*
[pdf]
[DOI]

Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders
Lucas Stoffl, Andy Bonnetto, Stéphane D'Ascoli, Alexander Mathis*
[pdf]
[DOI]

BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
Gwanghyun Kim, Hayeon Kim, Hoigi Seo, Dong Un Kang, Se Young Chun*
[pdf]
[DOI]

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views
Chao Xu, Ang Li, Linghao Chen, Yulin Liu, Ruoxi Shi, Hao Su*, Minghua Liu*
[pdf]
[DOI]

MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke, Serge Belongie, Christian Igel, Nico Lang*
[pdf]
[DOI]

Discovering Unwritten Visual Classifiers with Large Language Models
Mia Chiquier*, Utkarsh Mall, Carl Vondrick
[pdf]
[DOI]

LITA: Language Instructed Temporal-Localization Assistant
De-An Huang*, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz
[pdf]
[DOI]

MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain
Timothy Chase Jr*, Karthik Dantu
[pdf]
[DOI]

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Keen You*, Haotian Zhang, Eldon Schoop, Floris Weers, Amanda Swearngin, Jeff Nichols, Yinfei Yang, Zhe Gan
[pdf]
[DOI]

Bridging the Pathology Domain Gap: Efficiently Adapting CLIP for Pathology Image Analysis with Limited Labeled Data
Zhengfeng Lai*, Joohi Chauhan, Brittany N. Dugger, Chen-Nee Chuah
[pdf]
[DOI]

AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
Yangchao Wu*, Tian Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong
[pdf]
[DOI]

CARB-Net: Camera-Assisted Radar-Based Network for Vulnerable Road User Detection
Wei-Yu Lee*, Martin Dimitrievski, David Van Hamme, Jan Aelterman, Ljubomir Jovanov, Wilfried Philips
[pdf]
[DOI]

SAH-SCI: Self-Supervised Adapter for Efficient Hyperspectral Snapshot Compressive Imaging
Haijin Zeng, Yuxi Liu, Yongyong Chen*, Youfa Liu, Chong Peng, Jingyong Su
[pdf]
[DOI]

Minimalist Vision with Freeform Pixels
Jeremy Klotz*, Shree Nayar
[pdf]
[DOI]

All You Need is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation
Seongho Kim, Byung Cheol Song*
[pdf]
[DOI]

LatentEditor: Text Driven Local Editing of 3D Scenes
Umar Khalid*, Hasan Iqbal, Muhammad Tayyab, Md Nazmul Karim, Jing Hua, Chen Chen
[pdf]
[DOI]

Single-Photon 3D Imaging with Equi-Depth Photon Histograms
Kaustubh Sadekar*, David Maier, Atul Ingle
[pdf]
[DOI]

Asynchronous Bioplausible Neuron for Spiking Neural Networks for Event-Based Vision
Hussain Sajwani, Dimitrios Makris, Yahya Zweiri, Fariborz Baghaei Naeini, Sanket Kachole*
[pdf]
[DOI]

Viewpoint textual inversion: discovering scene representations and 3D view control in 2D diffusion models
James Burgess*, Kuan-Chieh Wang, Serena Yeung-Levy
[pdf]
[DOI]

POET: Prompt Offset Tuning for Continual Human Action Adaptation
Prachi Garg*, Joseph K J, Vineeth N Balasubramanian, Necati Cihan Camgoz, Chengde Wan, Kenrick Kin, Weiguang Si, Shugao Ma, Fernando de la Torre
[pdf]
[DOI]

Domain Generalization of 3D Object Detection by Density-Resampling
Shuangzhi Li, Lei Ma, Xingyu Li*
[pdf]
[DOI]

IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers
Chenglin Yang*, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan Yuille, Jiahui Yu
[pdf]
[DOI]

MRSP: Learn Multi-Representations of Single Primitive for Compositional Zero-Shot Learning
Dongyao Jiang, Hui Chen, Haodong Jing, Yongqiang Ma, Nanning Zheng*
[pdf]
[DOI]

Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs
Jeongkee Lim, Yusung Kim*
[pdf]
[DOI]

TrafficNight: An Aerial Multimodal Benchmark For Nighttime Vehicle Surveillance
Guoxing Zhang, Yiming Liu, Xiaoyu Yang, Chao Huang*, Hailong Huang
[pdf]
[DOI]

Loc3Diff: Local Diffusion for 3D Human Head Synthesis and Editing
Yushi Lan*, Feitong Tan, Qiangeng Xu, Di Qiu, Kyle Genova, Zeng Huang, Rohit Pandey, Sean Fanello, Thomas Funkhouser, Chen Change Loy, Yinda Zhang*
[pdf]
[DOI]

Towards Open Domain Text-Driven Synthesis of Multi-Person Motions
Mengyi Shan, Lu Dong, Yutao Han, Yuan Yao, Tao Liu, Ifeoma Nwogu, Guo-Jun Qi, Mitchell K Hill*
[pdf]
[DOI]

Generative End-to-End Autonomous Driving
Wenzhao Zheng, Ruiqi Song, Xianda Guo*, Chenming Zhang, Long Chen
[pdf]
[DOI]

Learning to Distinguish Samples for Generalized Category Discovery
Fengxiang Yang, Nan Pu, Wenjing Li, Zhiming Luo*, Shaozi Li, Nicu Sebe, Zhun Zhong*
[pdf]
[DOI]

COM Kitchens: An Unedited Overhead-view Procedural Videos Dataset as a Vision-Language Benchmark
Atsushi Hashimoto*, Koki Maeda, Tosho Hirasawa, Jun Harashima, Leszek Rybicki, Yusuke Fukasawa, Yoshitaka Ushiku
[pdf]
[DOI]

PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
Haiyang Guo*, Fei Zhu, Wenzhuo Liu, Xu-Yao Zhang*, Cheng-Lin Liu
[pdf]
[DOI]

Diff-Reg: Diffusion Model in Doubly Stochastic Matrix Space for Registration Problem
Qianliang Wu*, Haobo Jiang*, Lei Luo, Jun Li, Yaqing Ding*, Jin Xie*, Jian Yang*
[pdf]
[DOI]

WBP: Training-time Backdoor Attacks through Hardware-based Weight Bit Poisoning
Kunbei Cai*, Zhenkai Zhang, Qian Lou, Fan Yao*
[pdf]
[DOI]

Towards Dual Transparent Liquid Level Estimation in Biomedical Lab: Dataset, Methods and Practice
Xiayu Wang, Ke Ma, Ruiyun Zhong, Xinggang Wang, Yi Fang, Yang Xiao, Tian Xia*
[pdf]
[DOI]

Encapsulating Knowledge in One Prompt
Qi Li*, Runpeng Yu*, Xinchao Wang*
[pdf]
[DOI]

Cross-Input Certified Training for Universal Perturbations
Changming Xu*, Gagandeep Singh
[pdf]
[DOI]

Visual Relationship Transformation
Xiaoyu Xu*, Jiayan Qiu, Baosheng Yu, Zhou Wang
[pdf]
[DOI]

Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data
Yuxuan Li, Sarthak Kumar Maharana, Yunhui Guo*
[pdf]
[DOI]

Delving into Adversarial Robustness on Document Tampering Localization
Huiru Shao, Zhuang Qian, Kaizhu Huang, Wei Wang, Xiaowei Huang, Qiufeng Wang*
[pdf]
[DOI]

Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing
Seongmin Hong, Jaehyeok Bae, Jongho Lee*, Se Young Chun*
[pdf]
[DOI]

Confidence-Based Iterative Generation for Real-World Image Super-Resolution
Jialun Peng, Xin Luo, Jingjing Fu*, Dong Liu*
[pdf]
[DOI]

Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy
Tao Li*, Weisen Jiang, Fanghui Liu, Xiaolin Huang, James Kwok
[pdf]
[DOI]

Correspondences of the Third Kind: Camera Pose Estimation from Object Reflection
Kohei Yamashita*, Vincent Lepetit, Ko Nishino
[pdf]
[DOI]

Seeing Faces in Things: A Model and Dataset for Pareidolia
Mark T Hamilton*, Simon Stent, Vasha G DuTell, Anne Harrington, Jennifer E Corbett, Ruth Rosenholtz, William T. Freeman
[pdf]
[DOI]

Cocktail Universal Adversarial Attack on Deep Neural Networks
Shaoxin Li*, Xiaofeng Liao, Xin Che, Xintong Li, Yong Zhang, Lingyang Chu*
[pdf]
[DOI]

Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Antoine Guédon*, Vincent Lepetit
[pdf]
[DOI]

AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han, Qifan Wang, Sohail A Dianat, Majid Rabbani, Raghuveer Rao, Yi Fang, Qiang Guan, Lifu Huang, Dongfang Liu*
[pdf]
[DOI]

FairViT: Fair Vision Transformer via Adaptive Masking
Bowei Tian, Ruijie Du, Yanning Shen*
[pdf]
[DOI]

TrojVLM: Backdoor Attack Against Vision Language Models
Weimin Lyu*, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
[pdf]
[DOI]

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
Xiangxiang Chu*, Jianlin Su, Bo Zhang*, Chunhua Shen
[pdf]
[DOI]

Frugal 3D Point Cloud Model Training via Progressive Near Point Filtering and Fused Aggregation
Donghyun Lee, Yejin Lee, Jae W. Lee*, Hongil Yoon*
[pdf]
[DOI]

HVCLIP: High-dimensional Vector in CLIP for Unsupervised Domain Adaptation
Noranart Vesdapunt*, Kah Kuen Fu, Yue Wu, Xu Zhang, Pradeep Natarajan
[pdf]
[DOI]

Improving 3D Semi-supervised Learning by Effectively Utilizing All Unlabelled Data
Sneha Paul*, Zachary Patterson, Nizar Bouguila
[pdf]
[DOI]

PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation
Renjie Lu, Jingke Meng*, Wei-Shi Zheng
[pdf]
[DOI]

MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction
Seongju Lee, Junseok Lee, Yeonguk Yu, Taeri Kim, Kyoobin Lee*
[pdf]
[DOI]

Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen, Jinlin Wu, Zhen Lei, Zhaoxiang Zhang, Chang Wen Chen*
[pdf]
[DOI]

Few-shot NeRF by Adaptive Rendering Loss Regularization
Qingshan Xu*, Xuanyu Yi, Jianyao Xu, Wenbing Tao, Yew Soon Ong, Hanwang Zhang
[pdf]
[DOI]

Investigating Style Similarity in Diffusion Models
Gowthami Somepalli*, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas A. Geiping, Abhinav Shrivastava, Tom Goldstein
[pdf]
[DOI]

JDT3D: Addressing the Gaps in LiDAR-Based Tracking-by-Attention
Brian Cheong*, Jiachen Zhou*, Steven L Waslander*
[pdf]
[DOI]

MagicMirror: Fast and High-Quality Avatar Generation with Constrained Search Space
Armand Comas, Di Qiu*, Menglei Chai, Marcel C. Bühler, Amit Raj, Ruiqi Gao, Qiangeng Xu, Mark J Matthews, Paulo Gotardo, Sergio Orts-Escolano, Thabo Beeler
[pdf]
[DOI]

EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang*, Furao Shen*, Jian Zhao
[pdf]
[DOI]

Timestep-Aware Correction for Quantized Diffusion Models
Yuzhe Yao, Feng Tian, Jun Chen*, Haonan Lin, Guang Dai, Yong Liu, Jingdong Wang
[pdf]
[DOI]

SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision
Ankit Vani*, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville
[pdf]
[DOI]

Towards compact reversible image representations for neural style transfer
Xiyao Liu, Siyu Yang, Jian Zhang*, Gerald Schaefer, Jiya Li, Xunli Fan, Songtao Wu, Hui Fang*
[pdf]
[DOI]

Out-of-Bounding-Box Triggers: A Stealthy Approach to Cheat Object Detectors
Tao Lin*, Lijia Yu*, Gaojie Jin*, Renjue Li*, Peng Wu*, Lijun Zhang*
[pdf]
[DOI]

GTMS: A Gradient-driven Tree-guided Mask-free Referring Image Segmentation Method
Haoxin Lv, Tianxiong Zhong, Sanyuan Zhao*
[pdf]
[DOI]

Long-term Temporal Context Gathering for Neural Video Compression
Linfeng Qi, Zhaoyang Jia, Jiahao Li, Bin Li, Houqiang Li, Yan Lu*
[pdf]
[DOI]

VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving
Yibo Liu*, Zheyuan Yang, Guile Wu, Yuan Ren, Kejian Lin, Liu Bingbing, Yang Liu, Jinjun Shan
[pdf]
[DOI]

From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation
Yunfei Xie*, Cihang Xie, Alan Yuille, Jieru Mei
[pdf]
[DOI]

Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling
Zixiao Wang*, Hongtao Xie, YuXin Wang, Yadong Qu, Fengjun Guo, Pengwei Liu
[pdf]
[DOI]

Unmasking Bias in Diffusion Model Training
Hu Yu, Li Shen, Jie Huang, Hongsheng Li, Feng Zhao*
[pdf]
[DOI]

Multimodal Label Relevance Ranking via Reinforcement Learning
Taian Guo, Taolin Zhang, Haoqian Wu, Hanjun Li, Ruizhi Qiao*, Xing Sun
[pdf]
[DOI]

Animate Your Motion: Turning Still Images into Dynamic Videos
Mingxiao Li*, Bo Wan*, Sien Moens, Tinne Tuytelaars
[pdf]
[DOI]

Layered Rendering Diffusion Model for Controllable Zero-Shot Image Synthesis
Zipeng Qi, Guoxi Huang*, Chenyang Liu, Fei Ye
[pdf]
[DOI]

CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
Kalliopi Basioti*, Mohamed A Abdelsalam*, Federico Fancellu*, Vladimir Pavlovic*, Afsaneh Fazly*
[pdf]
[DOI]

A Simple Background Augmentation Method for Object Detection with Diffusion Model
Yuhang Li, Xin Dong, Chen Chen, Weiming Zhuang, Lingjuan Lyu*
[pdf]
[DOI]

Echoes of the Past: Boosting Long-tail Recognition via Reflective Learning
Qihao Zhao, Yalun Dai, Shen Lin, Wei Hu, Fan Zhang*, Jun Liu
[pdf]
[DOI]

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events
Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang*, Hongsheng Li*
[pdf]
[DOI]

A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization
Qiyu Chen, Huiyuan Luo, Chengkan Lv*, Zhengtao Zhang
[pdf]
[DOI]

Deep Polarization Cues for Single-shot Shape and Subsurface Scattering Estimation
Chenhao Li*, Trung Thanh Ngo, Hajime Nagahara
[pdf]
[DOI]

Rethinking Features-Fused-Pyramid-Neck for Object Detection
Hulin Li*
[pdf]
[DOI]

Spatial-Temporal Multi-level Association for Video Object Segmentation
Deshui Miao, Xin Li, Zhenyu He*, Huchuan Lu, Ming-Hsuan Yang
[pdf]
[DOI]

Sparse Refinement for Efficient High-Resolution Semantic Segmentation
Zhijian Liu, Zhuoyang Zhang, Samir Khaki, Shang Yang, Haotian Tang, Chenfeng Xu, Kurt Keutzer, Song Han*
[pdf]
[DOI]

Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion
Sanghyun Kim*, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee*
[pdf]
[DOI]

An Explainable Vision Question Answer Model via Diffusion Chain-of-Thought
Chunhao LU, Qiang Lu*, Jake Luo
[pdf]
[DOI]

RaFE: Generative Radiance Fields Restoration
Zhongkai Wu, Ziyu Wan, Jing Zhang*, Jing Liao, Dong Xu
[pdf]
[DOI]

UniProcessor: A Text-induced Unified Low-level Image Processor
Huiyu Duan*, Xiongkuo Min, Sijing Wu, Wei Shen, Guangtao Zhai
[pdf]
[DOI]

Fast Sprite Decomposition from Animated Graphics
Tomoyuki Suzuki*, Kotaro Kikuchi, Kota Yamaguchi
[pdf]
[DOI]

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection
Liren He, Zhengkai Jiang, Jinlong Peng, Wenbing Zhu, Liang Liu, Qiangang Du, Xiaobin Hu, Mingmin Chi*, Yabiao Wang*, Chengjie Wang*
[pdf]
[DOI]

IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection
Mingjin Zhang, Yuchun Wang*, Jie Guo*, Yunsong Li, Xinbo Gao, Jing Zhang
[pdf]
[DOI]

PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation
Zhenyu Li*, Shariq Farooq Bhat, Peter Wonka
[pdf]
[DOI]

A Geometric Distortion Immunized Deep Watermarking Framework with Robustness Generalizability
Linfeng Ma, Han Fang*, Tianyi Wei, Zijin Yang, Zehua Ma*, Weiming Zhang, Nenghai Yu
[pdf]
[DOI]

Towards Robust Event-based Networks for Nighttime via Unpaired Day-to-Night Event Translation
Yuhwan Jeong, Hoonhee Cho, Kuk-Jin Yoon*
[pdf]
[DOI]

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
Akshat Ramachandran*, Souvik Kundu*, Tushar Krishna*
[pdf]
[DOI]

A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-shaped Structures
Tahmina Khanam, Mohammed Bennamoun, Guan Wang, Guanjin Wang, Ferdous Sohel, Farid Boussaid, Anuj Srivastava, Hamid Laga*
[pdf]
[DOI]

Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation
Yushun Tang, Shuoshuo Chen, Zhihe Lu, Xinchao Wang, Zhihai He*
[pdf]
[DOI]

Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design
Gen Li*, Zhihao Shu, Jie Ji, Minghai Qin, Fatemeh Afghah, Wei Niu, Xiaolong Ma*
[pdf]
[DOI]

The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers
Seungwoo Son*, Jegwang Ryu, Namhoon Lee, Jaeho Lee*
[pdf]
[DOI]

Training A Small Emotional Vision Language Model for Visual Art Comprehension
Jing Zhang, Liang Zheng*, Meng Wang, Dan Guo*
[pdf]
[DOI]

UGG: Unified Generative Grasping
Jiaxin Lu, Hao Kang, Haoxiang Li, Bo Liu, Yiding Yang, Qixing Huang, Gang Hua*
[pdf]
[DOI]

FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
Chenliang Zhou*, Fangcheng Zhong, Param Hanji, Zhilin Guo, Kyle Thomas Fogarty, Alejandro Sztrajman, Hongyun Gao, A. Cengiz Oztireli
[pdf]
[DOI]

Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
Bin-Bin Gao*
[pdf]
[DOI]

GAMMA-FACE: GAussian Mixture Models Amend Diffusion Models for Bias Mitigation in Face Images
Basudha Pal*, Arunkumar Kannan*, Ram Prabhakar Kathirvel, Alice O'Toole, Rama Chellappa
[pdf]
[DOI]

Reinforcement Learning Friendly Vision-Language Model for Minecraft
Haobin Jiang, Junpeng Yue, Hao Luo, Ziluo Ding, Zongqing Lu*
[pdf]
[DOI]

Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation
Seonghoon Yu, Paul Hongsuck Seo*, Jeany Son*
[pdf]
[DOI]

Training-free Composite Scene Generation for Layout-to-Image Synthesis
Jiaqi Liu*, Tao Huang, Chang Xu
[pdf]
[DOI]

Robustness Preserving Fine-tuning using Neuron Importance
Guangrui Li, Rahul Duggal*, Aaditya Singh, Kaustav Kundu, Bing Shuai, Jonathan Wu
[pdf]
[DOI]

ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang*
[pdf]
[DOI]

PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation
Jian Ma, Chen Chen*, Qingsong Xie, Haonan Lu*
[pdf]
[DOI]

Similarity of Neural Architectures using Adversarial Attack Transferability
Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, Sanghyuk Chun*, Jong-Seok Lee
[pdf]
[DOI]

Dual-Rain: Video Rain Removal using Assertive and Gentle Teachers
Tingting Chen*, Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Robby T. Tan
[pdf]
[DOI]

PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation
Ning Gao, Sanping Zhou*, Le Wang, Nanning Zheng
[pdf]
[DOI]

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor*, Yash Parag Butala*, Melisa A Russak, Jing Yu Koh, Kiran Kamble, Waseem AlShikh, Ruslan Salakhutdinov
[pdf]
[DOI]

AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering
Xiuyuan Chen, Yuan Lin*, Yuchen Zhang*, Weiran Huang*
[pdf]
[DOI]

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang, Teng Wang, Haigang Zhang, Ping Lu, Feng Zheng*
[pdf]
[DOI]

Unsupervised Variational Translator for Bridging Image Restoration and High-Level Vision Tasks
Jiawei Wu, Zhi Jin*
[pdf]
[DOI]

Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
Duy Tho Le*, Hengcan Shi*, Jianfei Cai, Hamid Rezatofighi
[pdf]
[DOI]

MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos
Yushuo Chen*, Zerong Zheng, Zhe Li, Chao Xu, Yebin Liu
[pdf]
[DOI]

Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement
Hao Xu*, Xi Zhang, Xiaolin Wu*
[pdf]
[DOI]

Scene-Conditional 3D Object Stylization and Composition
Jinghao Zhou*, Tomas Jakab, Philip Torr, Christian Rupprecht
[pdf]
[DOI]

GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning
Xiaojie Li, Yibo Yang*, Xiangtai Li, Jianlong Wu*, Yue Yu, Bernard Ghanem, Min Zhang
[pdf]
[DOI]

Revisit Anything: Visual Place Recognition via Image Segment Retrieval
Kartik Garg, Sai Shubodh, Shishir N Y Kolathaya, Madhava Krishna, Sourav Garg*
[pdf]
[DOI]

EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
Peiqi Chen*, Lei Yu, Yi Wan*, Yongjun Zhang*, Jian Wang, Liheng Zhong, Jingdong Chen, Ming Yang
[pdf]
[DOI]

DGD: Dynamic 3D Gaussians Distillation
Isaac Labe, Noam Issachar, Itai Lang, Sagie Benaim*
[pdf]
[DOI]

Semantic Diversity-aware Prototype-based Learning for Unbiased Scene Graph Generation
Jaehyeong Jeon*, Kibum Kim, Kanghoon Yoon, Chanyoung Park
[pdf]
[DOI]

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo*, Xiaozhong Ji, Jinlong Peng, ZhengKai Jiang, Jiangning Zhang, Taisong Jin*, Chengjie Wang, Rongrong Ji
[pdf]
[DOI]

Self-Guided Generation of Minority Samples Using Diffusion Models
Soobin Um, Jong Chul Ye*
[pdf]
[DOI]

DEVIAS: Learning Disentangled Video Representations of Action and Scene
Kyungho Bae, Youngrae Kim, Geo Ahn, Jinwoo Choi*
[pdf]
[DOI]

AD3: Introducing a score for Anomaly Detection Dataset Difficulty assessment using VIADUCT dataset
Jan D Lehr*, Jan H Philipps, Alik Sargsyan, Martin Pape, Jörg Krüger
[pdf]
[DOI]

RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
Qi WANG*, Ruijie Lu, Xudong XU, Jingbo Wang, Michael Yu Wang, Bo Dai, Gang Zeng, Dan Xu
[pdf]
[DOI]

Class-Agnostic Object Counting with Text-to-Image Diffusion Model
Xiaofei Hui, Qian Wu, Hossein Rahmani, Jun Liu*
[pdf]
[DOI]

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks
Sehwan Choi*, Jun Won Choi, Jungho Kim, Hongjae Shin
[pdf]
[DOI]

SUP-NeRF: A Streamlined Unification of Pose Estimation and NeRF for Monocular 3D Object Reconstruction
Yuliang Guo*, Abhinav Kumar, Cheng Zhao, Ruoyu Wang, Xinyu Huang, Liu Ren
[pdf]
[DOI]

Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme
Jintae Kim, Seungwon Yang, Seong-Gyun Jeong, Chang-Su Kim*
[pdf]
[DOI]

Pyramid Diffusion for Fine 3D Large Scene Generation
Yuheng Liu*, Xinke Li, Xueting Li, Lu Qi*, Chongshou Li, Ming-Hsuan Yang
[pdf]
[DOI]

ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model
Wenyu Li*, Binghui Chen, Yifeng Geng, Xuansong Xie, Wangmeng Zuo
[pdf]
[DOI]

A Watermark-Conditioned Diffusion Model for IP Protection
Rui Min*, Sen Li*, Hongyang Chen*, Minhao Cheng*
[pdf]
[DOI]

Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
Seongsu Ha, Chaeyun Kim, Donghwa Kim, Junho Lee, Sangho Lee, Joonseok Lee*
[pdf]
[DOI]

SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
Bac Nguyen*, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville
[pdf]
[DOI]

FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion
Xiaofeng Wu*, Velibor Bojkovic, Bin Gu*, Kun Suo, Kai Zou
[pdf]
[DOI]

Improving Vision and Language Concepts Understanding with Multimodal Counterfactual Samples
Chengen Lai, Shengli Song*, Sitong Yan, Guangneng Hu
[pdf]
[DOI]

Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation
Xu Zheng*, Yuanhuiyi Lyu, Jiazhou Zhou, Lin Wang*
[pdf]
[DOI]

GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation
Haonan Wang, Jie Liu*, Jie Tang, Gangshan Wu, Bo Xu, Yanbing Chou, Yong Wang
[pdf]
[DOI]

Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations
Ofir Shifman*, Yair Weiss
[pdf]
[DOI]

DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation
Soojin Jang, JungMin Yun, JuneHyoung Kwon, Eunju Lee, YoungBin Kim*
[pdf]
[DOI]

Rethinking Normalization Layers for Domain Generalizable Person Re-identification
Ren Nie, Jin Ding, Xue Zhou*, Xi Li
[pdf]
[DOI]

Generalizing to Unseen Domains via Text-guided Augmentation
Daiqing Qi*, Handong Zhao, Aidong Zhang, Sheng Li
[pdf]
[DOI]

VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation
Zhen Qu, Xian Tao*, Mukesh Prasad, Fei Shen, Zhengtao Zhang, Xinyi Gong, Guiguang Ding
[pdf]
[DOI]

Lost in Translation: Latent Concept Misalignment in Text-to-Image Diffusion Models
Juntu Zhao, Junyu Deng, Yixin Ye, Chongxuan Li, Zhijie Deng*, Dequan Wang*
[pdf]
[DOI]

Crowd-SAM: SAM as a smart annotator for object detection in crowded scenes
Zhi Cai, Yingjie Gao, Yaoyan Zheng, Nan Zhou, Di Huang*
[pdf]
[DOI]

Zero-shot Text-guided Infinite Image Synthesis with LLM guidance
Soyeong Kwon, Taegyeong Lee, Taehwan Kim*
[pdf]
[DOI]

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution
Zhiheng Li, Muheng Li, Jixuan Fan, Lei Chen*, Yansong Tang, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model
Yang Jin, Lei Zhang, Shi Yan, Bin Fan, Binglu Wang*
[pdf]
[DOI]

Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization
Xi Yang, Songsong Duan*, Nannan Wang, Xinbo Gao
[pdf]
[DOI]

Adaptive Multi-head Contrastive Learning
Lei Wang*, Piotr Koniusz, Tom Gedeon, Liang Zheng
[pdf]
[DOI]

Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation
Yao Yao, Yixuan Pan, Wenjun Shi, Dongchen Zhu, Lei Wang, Jiamao Li*
[pdf]
[DOI]

Easing 3D Pattern Reasoning with Side-view Features for Semantic Scene Completion
Linxi Huan, Mingyue Dong, Linwei Yue, Shuhan Shen, Xianwei Zheng*
[pdf]
[DOI]

DSMix: Distortion-Induced Saliency Map Based Pre-training for No-Reference Image Quality Assessment
Jinsong Shi, Pan Gao*, Xiaojiang Peng, Jie Qin
[pdf]
[DOI]

MO-EMT-NAS: Multi-Objective Continuous Transfer of Architectural Knowledge Between Tasks from Different Datasets
Peng Liao*, Xilu Wang*, Yaochu Jin*, Wenli Du*
[pdf]
[DOI]

Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
Animesh Sinha*, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy L Bearman, Dhruv Mahajan
[pdf]
[DOI]

Adaptive Annealing for Robust Averaging
Sidhartha Chitturi*, Venu Madhav Govindu
[pdf]
[DOI]

GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity
Shuo Cao, Yihao Liu, Wenlong Zhang, Yu Qiao, Chao Dong*
[pdf]
[DOI]

MaxMI: A Maximal Mutual Information Criterion for Manipulation Concept Discovery
Pei Zhou, Yanchao Yang*
[pdf]
[DOI]

High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
Xin Ming, Jiawei Li, Jingwang Ling, Libo Zhang, Feng Xu*
[pdf]
[DOI]

Disentangling Masked Autoencoders for Unsupervised Domain Generalization
An Zhang*, Han Wang, Xiang Wang, Tat-Seng Chua
[pdf]
[DOI]

Early Anticipation of Driving Maneuvers
Abdul Wasi Lone, Shankar Gangisetty*, Shyam Nandan Rai, C. V. Jawahar
[pdf]
[DOI]

Bottom-Up Domain Prompt Tuning for Generalized Face Anti-Spoofing
Siqi Liu*, Qirui Wang, Pong C. Yuen
[pdf]
[DOI]

SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization
Yiyang Chen, Siyan Dong*, Xulong Wang, Lulu Cai, Youyi Zheng, Yanchao Yang*
[pdf]
[DOI]

On the Evaluation Consistency of Attribution-based Explanations
Jiarui Duan, Haoling Li, Haofei Zhang, Hao Jiang, Mengqi Xue, Li Sun, Mingli Song, Jie Song*
[pdf]
[DOI]

Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang, Peng Wu, Yawei Li, Xinxin Zhang, Xiankai Lu*
[pdf]
[DOI]

InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction
Xulong Wang, Siyan Dong*, Youyi Zheng, Yanchao Yang*
[pdf]
[DOI]

DreamReward: Aligning Human Preference in Text-to-3D Generation
Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan*, Jun Zhu*
[pdf]
[DOI]

Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen*, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman
[pdf]
[DOI]

Frontier-enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation
Xinru Cui, Qiming Liu, Zhe Liu, Hesheng Wang*
[pdf]
[DOI]

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
Baijiong Lin*, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, Yingcong Chen
[pdf]
[DOI]

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Shicheng Li, Lei Li, Yi Liu, Shuhuai Ren, Yuanxin Liu, Rundong Gao, Xu Sun*, Lu Hou
[pdf]
[DOI]

Learning a Dynamic Privacy-preserving Camera Robust to Inversion Attacks
Jiacheng Cheng*, Xiang Dai, Jia Wan, Nick Antipa, Nuno Vasconcelos
[pdf]
[DOI]

CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches
Sifan Wu*, Amir Hosein Khasahmadi, Mor Katz, Pradeep Kumar Jayaraman, Yewen Pu, Karl D.D. Willis, Bang Liu*
[pdf]
[DOI]

Towards Image Ambient Lighting Normalization
Florin-Alexandru Vasluianu*, Tim Seizinger, Zongwei WU*, Rakesh Ranjan, Radu Timofte
[pdf]
[DOI]

FedHide: Federated Learning by Hiding in the Neighbors
Hyunsin Park*, Sungrack Yun
[pdf]
[DOI]

Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients
Dohyung Kim, Junghyup Lee, Jeimin Jeon, Jaehyeon Moon, Bumsub Ham*
[pdf]
[DOI]

SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery
Sarah Rastegar*, Mohammadreza Salehi, Yuki M Asano, Hazel Doughty, Cees Snoek
[pdf]
[DOI]

Self-Cooperation Knowledge Distillation for Novel Class Discovery
Yuzheng Wang*, Zhaoyu Chen, Dingkang Yang, Yunquan Sun, Lizhe Qi*
[pdf]
[DOI]

EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding
Jiazhou Zhou*, Xu Zheng, Yuanhuiyi Lyu, Lin Wang
[pdf]
[DOI]

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection
Hang Yao, Ming Liu*, Zhicun Yin, Zifei Yan, Xiaopeng Hong, Wangmeng Zuo
[pdf]
[DOI]

MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks
Elad Hirsch*, Gefen Dawidowicz, Ayellet Tal
[pdf]
[DOI]

Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?
Rosario Leonardi*, Antonino Furnari, Francesco Ragusa, Giovanni Maria Farinella
[pdf]
[DOI]

PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation
Ginger Delmas*, Philippe Weinzaepfel, Francesc Moreno-Noguer, Gregory Rogez
[pdf]
[DOI]

A Comparative Study of Image Restoration Networks for General Backbone Network Design
Xiangyu Chen*, Zheyuan Li, Yuandong Pu, Yihao Liu, Jiantao Zhou*, Yu Qiao, Chao Dong*
[pdf]
[DOI]

Learned Image Enhancement via Color Naming
David Serrano-Lozano*, Luis Herranz, Michael S Brown, Javier Vazquez-Corral
[pdf]
[DOI]

Synthesizing Time-varying BRDFs via Latent Space
Takuto Narumoto*, Hiroaki Santo, Fumio Okura
[pdf]
[DOI]

HoloADMM: High-Quality Holographic Complex Field Recovery
Mazen Mel*, Paul Springer, Pietro Zanuttigh, Haitao Zhou, Alexander Gatto
[pdf]
[DOI]

Fundamental Matrix Estimation Using Relative Depths
Yaqing Ding*, Václav Vávra, Snehal Bhayani, Qianliang Wu, Jian Yang, Zuzana Kukelova
[pdf]
[DOI]

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion
Otto Seiskari*, Jerry Ylilammi, Valtteri Kaatrasalo, Pekka Rantalankila, Matias Turkulainen, Juho Kannala, Esa Rahtu, Arno Solin
[pdf]
[DOI]

MTaDCS: Moving Trace and Feature Density-based Confidence Sample Selection under Label Noise
Qingzheng Huang, Xilin He, Xiaole Xian, Qinliang Lin, Weicheng Xie*, Siyang Song, Linlin Shen, Zitong Yu
[pdf]
[DOI]

Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis
Brian Kostadinov Shalon Isaac-Medina*, Yona Falinie Abdul Gaus*, Neelanjan Bhowmik, Toby P Breckon
[pdf]
[DOI]

GroundUp: Rapid Sketch-Based 3D City Massing
Gizem Esra Unlu*, Mohamed Sayed, Yulia Gryaditskaya, Gabriel Brostow
[pdf]
[DOI]

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing
Vadim Titov*, Madina Khalmatova*, Alexandra Ivanova*, Dmitry P Vetrov, Aibek Alanov*
[pdf]
[DOI]

DataDream: Few-shot Guided Dataset Generation
Jae Myung Kim*, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata
[pdf]
[DOI]

LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu*, Zhe Wang*, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu*, Xiaoli Li, Weisi Lin*
[pdf]
[DOI]

CipherDM: Secure Three-Party Inference for Diffusion Model Sampling
Xin Zhao, Xiaojun Chen*, Xudong Chen, He Li, Tingyu Fan, Zhendong Zhao
[pdf]
[DOI]

Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine MAROUF*, Subhankar Roy, Enzo Tartaglione, Stéphane Lathuilière
[pdf]
[DOI]

GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time
Hao Li, Yuanyuan Gao, Dingwen Zhang*, Chenming Wu, Yalun Dai, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Junwei Han
[pdf]
[DOI]

A Unified Image Compression Method for Human Perception and Multiple Vision Tasks
Sha Guo, Lin Sui, Chen-Lin Zhang, Zhuo Chen, Wenhan Yang, Lingyu Duan*
[pdf]
[DOI]

UniVoxel: Fast Inverse Rendering by Unified Voxelization of Scene Representation
Shuang Wu, Songlin Tang, Guangming Lu, Jianzhuang Liu, Wenjie Pei*
[pdf]
[DOI]

Audio-visual Generalized Zero-shot Learning the Easy Way
Shentong Mo*, Pedro Morgado
[pdf]
[DOI]

PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition
Xiao Li*, Yining Liu, Na Dong, Sitian Qin, Xiaolin Hu
[pdf]
[DOI]

Learning Equilibrium Transformation for Gamut Expansion and Color Restoration
Jun Xiao*, Changjian Shui, Zhi-Song Liu, Qian Ye, Kin-Man Lam
[pdf]
[DOI]

Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition
Yurong Zhang*, Honghao Chen, Zhang Xinyu, Xiangxiang Chu, Li Song
[pdf]
[DOI]

Physics-informed Knowledge Transfer for Underwater Monocular Depth Estimation
Jinghe Yang*, Mingming Gong, Ye Pu
[pdf]
[DOI]

Robust Nearest Neighbors for Source-Free Domain Adaptation under Class Distribution Shift
Antonio Tejero-de-Pablos*, Riku Togashi, Mayu Otani, Shin'ichi Satoh
[pdf]
[DOI]

Chains of Diffusion Models
Yanheng Wei*, Lianghua Huang*, Zhi-Fan Wu, Wei Wang, Yu Liu, Mingda Jia, Shuailei Ma
[pdf]
[DOI]

Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models
Phuong Hoang Dam*, Jihoon Jeong*, Anh T Tran*, Daeyoung Kim*
[pdf]
[DOI]

Feature Diversification and Adaptation for Federated Domain Generalization
Seunghan Yang*, Seokeon Choi, Hyunsin Park, Sungha Choi, Simyung Chang, Sungrack Yun
[pdf]
[DOI]

Grounding Image Matching in 3D with MASt3R
Vincent Leroy*, Yohann Cabon, Jerome Revaud
[pdf]
[DOI]

TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
Jun Li*, Zedong Zhang, Jian Yang
[pdf]
[DOI]

RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes
Thang-Anh-Quan Nguyen*, Luis G Roldao Jimenez*, Nathan Piasco*, Moussab Bennehar*, Dzmitry Tsishkou*
[pdf]
[DOI]

RecurrentBEV: A Long-term Temporal Fusion Framework for Multi-view 3D Detection
Ming Chang, Xishan Zhang*, Rui Zhang, Zhipeng Zhao, Guanhua He, Shaoli Liu
[pdf]
[DOI]

Efficient Bias Mitigation Without Privileged Information
Mateo Espinosa Zarlenga*, Swami Sankaranarayanan, Jerone T. A. Andrews, Zohreh Shams, Mateja Jamnik, Alice Xiang
[pdf]
[DOI]

MC-PanDA: Mask Confidence for Panoptic Domain Adaptation
Ivan Martinović*, Josip Šarić, Siniša Šegvić
[pdf]
[DOI]

Learning Neural Deformation Representation for 4D Dynamic Shape Generation
Gyojin Han*, Jiwan Hur, Jaehyun Choi, Junmo Kim*
[pdf]
[DOI]

Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge
Hyejin Park, Dongbo Min*
[pdf]
[DOI]

Decomposition Betters Tracking Everything Everywhere
Rui Li, Dong Liu*
[pdf]
[DOI]

Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
Ruizi Han*, Jinglei Tang*
[pdf]
[DOI]

Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs
Camillo Quattrocchi*, Antonino Furnari, Daniele Di Mauro, Mario Valerio Giuffrida, Giovanni Maria Farinella
[pdf]
[DOI]

LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models
Yabin Zhang*, Wenjie Zhu, Chenhang He, Lei Zhang*
[pdf]
[DOI]

Domain Shifting: A Generalized Solution for Heterogeneous Cross-Modality Person Re-Identification
Yan Jiang, Xu Cheng*, Hao Yu, Xingyu Liu, Haoyu Chen, Guoying Zhao
[pdf]
[DOI]

Self-Supervised Video Desmoking for Laparoscopic Surgery
Renlong Wu, Zhilu Zhang*, Shuohao Zhang, Longfei Gou, Haobin Chen, Lei Zhang, Hao Chen*, Wangmeng Zuo
[pdf]
[DOI]

Removing Rows and Columns of Tokens in Vision Transformer enables Faster Dense Prediction without Retraining
Diwei Su, Cheng Fei, Jianxu Luo*
[pdf]
[DOI]

Continuity Preserving Online CenterLine Graph Learning
Yunhui Han, Kun Yu, Zhiwei Li*
[pdf]
[DOI]

Decomposition of Neural Discrete Representations for Large-Scale 3D Mapping
Minseong Park, Suhan Woo, Euntai Kim*
[pdf]
[DOI]

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Jiayue Liu, Xiao Tang, Freeman Cheng, Zihao Yang, Zhihao Li*, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan*
[pdf]
[DOI]

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection
Christos Koutlis*, Symeon Papadopoulos
[pdf]
[DOI]

Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data
Yanmeng Yao, Xiaohan Zhao, Bin Gu*
[pdf]
[DOI]

HSR: Holistic 3D Human-Scene Reconstruction from Monocular Videos
Lixin Xue*, Chen Guo, Chengwei Zheng, Fangjinhua Wang, Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, Jie Song, Otmar Hilliges
[pdf]
[DOI]

Online Video Quality Enhancement with Spatial-Temporal Look-up Tables
Zefan Qu, Xinyang Jiang*, Yifan Yang, Dongsheng Li, Cairong Zhao*
[pdf]
[DOI]

PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem*, Jean Lahoud, Hisham Cholakkal*
[pdf]
[DOI]

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Jungwoo Kim, Wooseok Jang, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin*, Seungryong Kim*
[pdf]
[DOI]

Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation
Zhaoyang Li*, Yuan Wang, Wangkai Li, Rui Sun, Tianzhu Zhang
[pdf]
[DOI]

Think before Placement: Common Sense Enhanced Transformer for Object Placement
Yaxuan Qin, Jiayu Xu, Ruiping Wang*, Xilin Chen
[pdf]
[DOI]

Oulu Remote-photoplethysmography Physical Domain Attacks Database (ORPDAD)
Marko Savic, Guoying Zhao*
[pdf]
[DOI]

Leveraging Imperfect Restoration for Data Availability Attack
Yi Huang*, Jeremy Styborski*, Mingzhi Lyu*, Fan Wang*, Wai-Kin Adams Kong*
[pdf]
[DOI]

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
Xiaoxu Xu, Yitian Yuan, Jinlong Li, Qiudan Zhang, Zequn Jie, Lin Ma, Hao Tang, Nicu Sebe, Xu Wang*
[pdf]
[DOI]

Open-set Domain Adaptation via Joint Error based Multi-class Positive and Unlabeled Learning
Dexuan Zhang*, Thomas Westfechtel, Tatsuya Harada
[pdf]
[DOI]

DoubleTake: Geometry Guided Depth Estimation
Mohamed Sayed*, Filippo Aleotti, Jamie Watson, Zawar Qureshi, Guillermo Garcia-Hernando, Gabriel Brostow, Sara Vicente, Michael Firman
[pdf]
[DOI]

Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
Fangwei Zhong*, Kui Wu, Hai Ci, Chu-ran Wang, Hao Chen
[pdf]
[DOI]

Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting
Yunzhi Yan*, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng*
[pdf]
[DOI]

Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
Yifan Li*, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen
[pdf]
[DOI]

Edge-Guided Fusion and Motion Augmentation for Event-Image Stereo
Fengan Zhao*, Qianang Zhou, Junlin Xiong*
[pdf]
[DOI]

MetaWeather: Few-Shot Weather-Degraded Image Restoration
Youngrae Kim*, Younggeol Cho, Thanh-Tung Nguyen, Seunghoon Hong, Dongman Lee*
[pdf]
[DOI]

CPT-VR: Improving Surface Rendering via Closest Point Transform with View-Reflection Appearance
Zhipeng Hu, Yongqiang Zhang*, Chen Liu, Lincheng Li*, Sida Peng, Xiaowei Zhou, Changjie Fan, Xin Yu
[pdf]
[DOI]

Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition
Sergio Izquierdo*, Javier Civera*
[pdf]
[DOI]

HiFi-123: Towards High-fidelity One Image to 3D Content Generation
Wangbo Yu*, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, Wenbo Hu, Long Quan, Ying Shan, Yonghong Tian
[pdf]
[DOI]

Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View
Jianan Fan*, Dongnan Liu, Canran Li, Hang Chang, Heng Huang, Filip Braet, Mei Chen, Weidong Cai*
[pdf]
[DOI]

Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
Amin Parchami-Araghi*, Moritz Böhle, Sukrut Rao, Bernt Schiele
[pdf]
[DOI]

Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation
Juncheng Ma, Peiwen Sun, Yaoting Wang, Di Hu*
[pdf]
[DOI]

FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park*
[pdf]
[DOI]

Möbius Transform for Mitigating Perspective Distortions in Representation Learning
Prakash Chandra Chhipa*, Meenakshi Subhash Chippa, Kanjar De, Rajkumar Saini, Marcus Liwicki, Mubarak Shah
[pdf]
[DOI]

TAG: Text Prompt Augmentation for Zero-Shot Out-of-Distribution Detection
Xixi Liu*, Christopher Zach
[pdf]
[DOI]

CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu, Yiming Li, Hang Zhao*
[pdf]
[DOI]

SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
Niklas Gard*, Anna Hilsmann, Peter Eisert
[pdf]
[DOI]

Continual Learning and Unknown Object Discovery in 3D Scenes via Self-Distillation
Mohamed El Amine Boudjoghra*, Jean Lahoud, Salman Khan, Hisham Cholakkal, Rao M Anwer, Fahad Shahbaz Khan
[pdf]
[DOI]

DiffCD: A Symmetric Differentiable Chamfer Distance for Neural Implicit Surface Fitting
Linus Härenstam-Nielsen*, Lu Sang, Abhishek Saroha, Nikita Araslanov*, Daniel Cremers*
[pdf]
[DOI]

Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking
Lorenzo Vaquero*, Yihong Xu, Xavier Alameda-Pineda, Victor M. Brea, Manuel Mucientes
[pdf]
[DOI]

Local Occupancy-Enhanced Object Grasping with Multiple Triplanar Projection
Kangqi Ma*, Hao Dong, Yadong Mu
[pdf]
[DOI]

Region-Native Visual Tokenization
Mengyu Wang*, Yuyao Huang, Henghui Ding, Xinlong Wang, Tiejun Huang, Yao Zhao, Yunchao Wei, Shuicheng Yan
[pdf]
[DOI]

SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
Mae Younes*, Amine Ouasfi, Adnane Boukhayma
[pdf]
[DOI]

Sketch2Vox: Learning 3D Reconstruction from a Single Monocular Sketch Image
Fei Wang*
[pdf]
[DOI]

DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
Minghao Chen*, Iro Laina, Andrea Vedaldi
[pdf]
[DOI]

The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization
Jiafeng Mao*, Xueting Wang, Kiyoharu Aizawa
[pdf]
[DOI]

Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond
Silvio Galesso*, Philipp Schröppel*, Hssan Driss, Thomas Brox
[pdf]
[DOI]

Rethinking Directional Parameterization in Neural Implicit Surface Reconstruction
Zijie Jiang*, Tianhan Xu*, Hiroharu Kato
[pdf]
[DOI]

A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment
Tianhe Wu, Kede Ma*, Jie Liang, Yujiu Yang*, Lei Zhang
[pdf]
[DOI]

Semi-Supervised Teacher-Reference-Student Architecture for Action Quality Assessment
Wulian Yun, Mengshi Qi, Fei Peng, Huadong Ma*
[pdf]
[DOI]

Efficient Neural Video Representation with Temporally Coherent Modulation
Seungjun Shin*, Suji Kim*, Dokwan Oh
[pdf]
[DOI]

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang, Peiwen Sun, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu*
[pdf]
[DOI]

DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling
Haoran Li, Haolin Shi, Wenli Zhang, Wenjun Wu, Yong Liao*, Lin Wang, Lik-Hang Lee, Peng Yuan Zhou*
[pdf]
[DOI]

Multi-modal Crowd Counting via a Broker Modality
Haoliang Meng, Xiaopeng Hong*, Chenhao Wang, Miao Shang, Wangmeng Zuo
[pdf]
[DOI]

FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation
Tianyu Zhang, Guocheng Qian, Jin Xie*, Jian Yang
[pdf]
[DOI]

Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Charig Yang*, Weidi Xie, Andrew Zisserman
[pdf]
[DOI]

PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration
Runzhao Yao, Shaoyi Du*, Wenting Cui, Canhui Tang, Chengwu Yang
[pdf]
[DOI]

Open-Vocabulary RGB-Thermal Semantic Segmentation
GuoQiang Zhao, JunJie Huang, Xiaoyun Yan*, Zhaojing Wang, Junwei Tang, Yangjun Ou, Xinrong Hu, Tao Peng
[pdf]
[DOI]

MeshVPR: Citywide Visual Place Recognition Using 3D Meshes
Gabriele Berton*, Lorenz Junglas, Riccardo Zaccone, Thomas Pollok, Barbara Caputo, Carlo Masone
[pdf]
[DOI]

Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang, Peiwen Sun, Yuanchao Li, Honggang Zhang, Di Hu*
[pdf]
[DOI]

Concise Plane Arrangements for Low-Poly Surface and Volume Modelling
Raphael Sulzer, Florent Lafarge*
[pdf]
[DOI]

KeypointDETR: An End-to-End 3D Keypoint Detector
Hairong Jin, Yuefan Shen, Jianwen Lou, Kun Zhou, Youyi Zheng*
[pdf]
[DOI]

ViPer: Visual Personalization of Generative Models via Individual Preference Learning
Sogand Salehi*, Mahdi Shafiei, Roman Bachmann, Teresa Yeo, Amir Zamir
[pdf]
[DOI]

MLPHand: Real Time Multi-View 3D Hand Reconstruction via MLP Modeling
Jian Yang, Jiakun Li, Guoming Li, Huaiyu Wu, Zhen Shen, Zhaoxin Fan*
[pdf]
[DOI]

uCAP: An Unsupervised Prompting Method for Vision-Language Models
A. Tuan Nguyen*, Kai Sheng Tai, Bor-Chun Chen, Satya Narayan Shukla, Hanchao Yu, Philip Torr, Tai-Peng Tian, Ser-Nam Lim
[pdf]
[DOI]

LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model
Dilxat Muhtar, Zhenshi Li, Feng Gu, Xueliang Zhang*, Pengfeng Xiao
[pdf]
[DOI]

How Far Can a 1-Pixel Camera Go? Solving Vision Tasks using Photoreceptors and Computationally Designed Visual Morphology
Andrei Atanov*, Rishubh Singh, Jiawei Fu, Isabella Yu, Andrew Spielberg, Amir Zamir
[pdf]
[DOI]

MONTAGE: Monitoring Training for Attribution of Generative Diffusion Models
Jonathan Brokman*, Omer Hofman, Roman Vainshtein, Amit Giloni, Toshiya Shimizu, Inderjeet Singh, Oren Rachmil, Alon Zolfi, Asaf Shabtai, Yuki Unno, Hisashi Kojima
[pdf]
[DOI]

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
Kilichbek Haydarov*, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin F Elsayed, Mohamed Elhoseiny
[pdf]
[DOI]

Watching it in Dark: A Target-aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination
Yunan Li*, Yihao Zhang, Shoude Li, Long Tian, Dou Quan, Chaoneng Li, Qiguang Miao*
[pdf]
[DOI]

Self-supervised visual learning from interactions with objects
Arthur Aubret*, Céline Teulière, Jochen Triesch
[pdf]
[DOI]

OP-Align: Object-level and Part-level Alignment for Self-supervised Category-level Articulated Object Pose Estimation
Yuchen Che*, Ryo Furukawa, Asako Kanezaki
[pdf]
[DOI]

BAFFLE: A Baseline of Backpropagation-Free Federated Learning
Haozhe Feng*, Tianyu Pang*, Chao Du, Wei Chen*, Shuicheng Yan, Min Lin
[pdf]
[DOI]

Sequential Representation Learning via Static-Dynamic Conditional Disentanglement
Mathieu Cyrille Simon*, Pascal Frossard, Christophe De Vleeschouwer
[pdf]
[DOI]

OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
Akshay Krishnan*, Abhijit Kundu*, Kevis-Kokitsi Maninis, James Hays, Matthew Brown
[pdf]
[DOI]

3R-INN: How to be climate friendly while consuming/delivering videos?
Zoubida Ameur*, Claire-Helene Demarty, Olivier Le Meur, Daniel Menard
[pdf]
[DOI]

Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction
Bingyu Xin*, Meng Ye, Leon Axel, Dimitris N. Metaxas
[pdf]
[DOI]

Towards Robust Full Low-bit Quantization of Super Resolution Networks
Denis S. Makhov*, Irina Zhelavskaya, Ruslan Ostapets, Dehua Song, Kirill Solodskikh
[pdf]
[DOI]

Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking
Jiyao Zhang, Weiyao Huang, Bo Peng, Mingdong Wu, Fei Hu, Zijian Chen, Bo Zhao, Hao Dong*
[pdf]
[DOI]

Diverse Text-to-3D Synthesis with Augmented Text Embedding
Uy Dieu Tran*, Minh N. Hoang Luu*, Phong Ha Nguyen*, Khoi Nguyen*, Binh-Son Hua*
[pdf]
[DOI]

Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation
Mathias Öttl*, Frauke Wilm, Jana Steenpass, Jingna Qiu, Matthias Rübner, Arndt Hartmann, Matthias W. Beckmann, Peter Fasching, Andreas K Maier, Ramona Erber, Bernhard Kainz, Katharina Breininger
[pdf]
[DOI]

LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang
Yuqing Zhang, Hangqi Li, Shengyu Zhang*, Runzhong Wang, Baoyi He, Huaiyong Dou, Junchi Yan*, Yongquan Zhang, Fei Wu
[pdf]
[DOI]

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
MohammadReza Davari*, Eugene Belilovsky
[pdf]
[DOI]

AdversariaLeak: External Information Leakage Attack Using Adversarial Samples on Face Recognition Systems
Roye Katzav*, Amit Giloni, Edita Grolman*, Hiroo Saito, Tomoyuki Shibata, Tsukasa Omino, Misaki Komatsu, Yoshikazu Hanatani, Yuval Elovici, Asaf Shabtai
[pdf]
[DOI]

iHuman: Instant Animatable Digital Humans From Monocular Videos
Pramish Paudel*, Anubhav Khanal, Danda Pani Paudel, Jyoti Tandukar, Ajad Chhatkuli
[pdf]
[DOI]

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation
Heyuan Li*, Ce Chen, Tianhao Shi, Yuda Qiu, Sizhe An, Guanying Chen, Xiaoguang Han*
[pdf]
[DOI]

Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier
Prantik Howlader*, Srijan Das, Hieu Le, Dimitris Samaras
[pdf]
[DOI]

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
Zeyu Liu, Weicong Liang, Zhanhao Liang, Chong Luo, Ji Li, Gao Huang, Yuhui Yuan*
[pdf]
[DOI]

Solving the inverse problem of microscopy deconvolution with a residual Beylkin-Coifman-Rokhlin neural network
Rui Li, Mikhail Kudryashev, Artur Yakimovich*
[pdf]
[DOI]

Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
Yoon Gyo Jung*, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps*
[pdf]
[DOI]

FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models
Andrea Caraffa*, Davide Boscaini, Amir Hamza, Fabio Poiesi
[pdf]
[DOI]

Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems
Hyungjin Chung, Jong Chul Ye*
[pdf]
[DOI]

Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation
Prantik Howlader*, Hieu Le, Dimitris Samaras
[pdf]
[DOI]

PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Junyi Li, Junfeng Wu, Weizhi Zhao, Song Bai, Xiang Bai*
[pdf]
[DOI]

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
Quan Kong*, Yuki Kawana, Rajat Saini, Ashutosh Kumar, Jingjing Pan, Ta Gu, Yohei Ozao, Balazs Opra, Yoichi Sato, Norimasa Kobori
[pdf]
[DOI]

Spiking Wavelet Transformer
Yuetong Fang, Ziqing Wang, Lingfeng Zhang, Jiahang Cao, Honglei Chen, Renjing Xu*
[pdf]
[DOI]

WAVE: Warping DDIM Inversion Features for Zero-shot Text-to-Video Editing
Yutang Feng, Sicheng Gao*, Yuxiang Bao, Xiaodi Wang, Shumin Han*, Juan Zhang*, Baochang Zhang, Angela Yao
[pdf]
[DOI]

PDT: UAV Target Detection Dataset for Pests and Diseases Tree
Mingle Zhou, Rui Xing, Delong Han, Zhiyong Qi, Gang Li*
[pdf]
[DOI]

Hypernetworks for Generalizable BRDF Representation
Fazilet Gokbudak*, Alejandro Sztrajman, Chenliang Zhou, Fangcheng Zhong, Rafal Mantiuk, A. Cengiz Oztireli
[pdf]
[DOI]

Photon Inhibition for Energy-Efficient Single-Photon Imaging
Lucas J Koerner*, Shantanu Gupta, Atul N Ingle, Mohit Gupta
[pdf]
[DOI]

COD: Learning Conditional Invariant Representation for Domain Adaptation Regression
Hao-Ran Yang, Chuan-Xian Ren*, You-Wei Luo
[pdf]
[DOI]

RANRAC: Robust Neural Scene Representations via Random Ray Consensus
Benno Buschmann*, Andreea Dogaru, Elmar Eisemann, Michael Weinmann, Bernhard Egger
[pdf]
[DOI]

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang*, Renjing Pei, Guansong Lu, Songcen Xu, Wei Zhang, Hang Xu
[pdf]
[DOI]

Characterizing Model Robustness via Natural Input Gradients
Adrian Rodriguez-Munoz*, Tongzhou Wang, Antonio Torralba
[pdf]
[DOI]

UpFusion: Novel View Diffusion from Unposed Sparse View Observations
Bharath Raj Nagoor Kani*, Hsin-Ying Lee, Sergey Tulyakov, Shubham Tulsiani
[pdf]
[DOI]

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding
Ozan Unal*, Christos Sakaridis, Suman Saha, Luc Van Gool
[pdf]
[DOI]

SIMBA: Split Inference - Mechanisms, Benchmarks and Attacks
Abhishek Singh*, Vivek Sharma, Rohan Sukumaran, John J Mose, Jeffrey K Chiu, Justin Yu, Ramesh Raskar
[pdf]
[DOI]

Tuning-Free Image Customization with Image and Text Guidance
Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, Yong Liu, Jinlong Peng, Chengjie Wang, Feng Zheng*
[pdf]
[DOI]

FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification
Yu Tian*, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang
[pdf]
[DOI]

Emerging Property of Masked Token for Effective Pre-training
Hyesong Choi, Hunsang Lee, Seyoung Joung, Hyejin Park, Jiyeong Kim, Dongbo Min*
[pdf]
[DOI]

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection
Yi-Xin Huang*, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng
[pdf]
[DOI]

Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation
Homanga Bharadhwaj*, Roozbeh Mottaghi, Abhinav Gupta, Shubham Tulsiani
[pdf]
[DOI]

SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians
Hiba Dahmani*, Moussab Bennehar, Nathan Piasco, Luis G Roldao Jimenez, Dzmitry Tsishkou
[pdf]
[DOI]

Gaussian in the wild: 3D Gaussian Splatting for Unconstrained Image Collections
Dongbin Zhang*, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang*
[pdf]
[DOI]

Few-shot Defect Image Generation based on Consistency Modeling
Qingfeng Shi, Jing Wei, Fei Shen*, Zhengtao Zhang
[pdf]
[DOI]

Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits
Ada-Astrid Balauca*, Danda Pani Paudel, Kristina Toutanova, Luc Van Gool
[pdf]
[DOI]

CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali*, Adrian Bulat*, Brais Martinez, Georgios Tzimiropoulos
[pdf]
[DOI]

Masked Motion Prediction with Semantic Contrast for Point Cloud Sequence Learning
Yuehui Han*, Can Xu, Rui Xu, Jianjun Qian, Jin Xie
[pdf]
[DOI]

Prompt-Based Test-Time Real Image Dehazing: A Novel Pipeline
Zixuan Chen, Zewei He*, Ziqian Lu, Xuecheng Sun, Zheming Lu
[pdf]
[DOI]

Video Editing via Factorized Diffusion Distillation
Uriel Singer*, Amit Zohar*, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Devi Parikh, Yaniv Taigman
[pdf]
[DOI]

Trackastra: Transformer-based cell tracking for live-cell microscopy
Benjamin Gallusser, Martin Weigert*
[pdf]
[DOI]

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion
Wendi Zheng*, Jiayan Teng, Zhuoyi Yang, Weihan Wang, Jidong Chen, Xiaotao Gu, Yuxiao Dong*, Ming Ding*, Jie Tang*
[pdf]
[DOI]

SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Nanye Ma*, Mark Goldstein, Michael Albergo, Nicholas M Boffi, Eric Vanden-Eijnden*, Saining Xie*
[pdf]
[DOI]

Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM
Baicheng Li*, Zike Yan*, Dong Wu, Hanqing Jiang, Hongbin Zha*
[pdf]
[DOI]

Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation
Sudhir Yarram*, Junsong Yuan
[pdf]
[DOI]

GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring
Emanuele Santellani*, Martin Zach, Christian Sormann, Mattia Rossi, Andreas Kuhn, Friedrich Fraundorfer
[pdf]
[DOI]

Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring
Sizhuo Li, Dimitri Gominski*, Martin Brandt, Xiaoye Tong, Philippe Ciais
[pdf]
[DOI]

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion
Daniel Winter*, Matan Cohen, Shlomi Fruchter, Yael Pritch, Alex Rav-Acha, Yedid Hoshen*
[pdf]
[DOI]

CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
ZiYang Gong, FuHao Li, Yupeng Deng, Deblina Bhattacharjee, Xianzheng Ma*, Xiangwei Zhu*, Zhenming Ji*
[pdf]
[DOI]

Curved Diffusion: A Generative Model With Optical Geometry Control
Andrey Voynov*, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or
[pdf]
[DOI]

Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians
Guangchi Fang, Bing Wang*
[pdf]
[DOI]

MeshSegmenter: Zero-Shot Mesh Segmentation via Texture Synthesis
Ziming Zhong*, Yanyu Xu, Jing Li, Jiale Xu, Zhengxin Li, Chaohui Yu, Shenghua Gao
[pdf]
[DOI]

OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation
Kwanyoung Kim, Yujin Oh, Jong Chul Ye*
[pdf]
[DOI]

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures
Yannick Kirchhoff*, Maximilian R Rokuss*, Saikat Roy*, Balint Kovacs, Constantin Ulrich, Tassilo Wald, Maximilian Zenk, Philipp Vollmuth, Jens Kleesiek, Fabian Isensee, Klaus H. Maier-Hein
[pdf]
[DOI]

Conceptual Codebook Learning for Vision-Language Models
Yi Zhang*, Ke Yu, Siqi Wu, Zhihai He*
[pdf]
[DOI]

LingoQA: Video Question Answering for Autonomous Driving
Ana-Maria Marcu*, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski
[pdf]
[DOI]

AnimateMe: 4D Facial Expressions via Diffusion Models
Dimitrios Gerogiannis*, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias, Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Stefanos Zafeiriou
[pdf]
[DOI]

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning
Zhecan Wang, Garrett Bingham*, Adams Wei Yu, Quoc V. Le, Thang Luong, Golnaz Ghiasi
[pdf]
[DOI]

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
Kevin Xie*, Tianshi Cao, Jonathan P Lorraine, Jun Gao, James R Lucas, Antonio Torralba, Sanja Fidler, Xiaohui Zeng
[pdf]
[DOI]

PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors
Tianyuan Yuan*, Yucheng Mao, Jiawei Yang, Yicheng Liu, Yue Wang, Hang Zhao*
[pdf]
[DOI]

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention
Jie Ren*, Yaxin Li, Shenglai Zeng, Han Xu, Lingjuan Lyu, Yue Xing, Jiliang Tang
[pdf]
[DOI]

iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning
Tom Fischer*, Yaoyao Liu, Artur Jesslen, Noor Ahmed, Prakhar Kaushik, Angtian Wang, Alan Yuille, Adam Kortylewski, Eddy Ilg
[pdf]
[DOI]

Context Diffusion: In-Context Aware Image Generation
Ivona Najdenkoska*, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan, Vignesh Ramanathan, Filip Radenovic
[pdf]
[DOI]

Pose Guided Fine-Grained Sign Language Video Generation
Tongkai Shi, Lianyu Hu, Fanhua Shang, Jichao Feng, Liu Peidong, Wei Feng*
[pdf]
[DOI]

RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos
Ali Zare*, Yulei Niu, Hammad Ayyubi, Shih-Fu Chang
[pdf]
[DOI]

Certifiably Robust Image Watermark
Zhengyuan Jiang*, Moyang Guo, Yuepeng Hu, Jinyuan Jia, Neil Zhenqiang Gong
[pdf]
[DOI]

Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery
Sukrut Rao*, Sweta Mahajan*, Moritz Böhle, Bernt Schiele
[pdf]
[DOI]

Online Zero-Shot Classification with CLIP
Qi Qian*, Juhua Hu
[pdf]
[DOI]

SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning
Qi Qian*, Yuanhong Xu, Juhua Hu
[pdf]
[DOI]

Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents
Yuqi Jia, Saeed Vahidian*, Jingwei Sun, Jianyi Zhang, Vyacheslav Kungurtsev, Neil Zhenqiang Gong, Yiran Chen
[pdf]
[DOI]

Rethinking Fast Adversarial Training: A Splitting Technique To Overcome Catastrophic Overfitting
Masoumeh Zareapoor, Pourya Shamsolmoali*
[pdf]
[DOI]

Quality Assured: Rethinking Annotation Strategies in Imaging AI
Tim Rädsch*, Annika Reinke, Vivienn Weru, Minu D. Tizabi, Nicholas Heller, Fabian Isensee, Annette Kopp-Schneider, Lena Maier-Hein*
[pdf]
[DOI]

BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto*, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
[pdf]
[DOI]

Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder
Jiajie Fan*, Amal Trigui*, Thomas Bäck, Hao Wang
[pdf]
[DOI]

Weakly-Supervised 3D Hand Reconstruction with Knowledge Prior and Uncertainty Guidance
Yufei Zhang*, Jeffrey Kephart, Qiang Ji*
[pdf]
[DOI]

3D Reconstruction of Objects in Hands without Real World 3D Supervision
Aditya Prakash*, Matthew Chang, Matthew Jin, Ruisen Tu, Saurabh Gupta
[pdf]
[DOI]

To Supervise or Not to Supervise: Understanding and Addressing the Key Challenges of Point Cloud Transfer Learning
Souhail Hadgi*, Lei Li, Maks Ovsjanikov
[pdf]
[DOI]

Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer
Xueyi Liu*, Kangbo Lyu, Jieqiong Zhang, Tao Du, Li Yi*
[pdf]
[DOI]

3D Hand Pose Estimation in Everyday Egocentric Images
Aditya Prakash*, Ruisen Tu, Matthew Chang, Saurabh Gupta
[pdf]
[DOI]

Mitigating Perspective Distortion-induced Shape Ambiguity in Image Crops
Aditya Prakash*, Arjun Gupta, Saurabh Gupta
[pdf]
[DOI]

Towards Neuro-Symbolic Video Understanding
Minkyu Choi*, Harsh Goel, Mohammad Omama, Yunhao Yang, Sahil Shah, Sandeep Chinchali
[pdf]
[DOI]

Optimization-based Uncertainty Attribution Via Learning Informative Perturbations
Hanjing Wang*, Bashirul Azam Biswas, Qiang Ji
[pdf]
[DOI]

Context-Aware Action Recognition: Introducing a Comprehensive Dataset for Behavior Contrast
Tatsuya Sasaki*, Yoshiki Ito, Satoshi Kondo
[pdf]
[DOI]

Semi-supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency
Meilong Xu*, Xiaoling Hu, Saumya Gupta, Shahira Abousamra, Chao Chen
[pdf]
[DOI]

Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling
Noam Elata*, Tomer Michaeli, Michael Elad
[pdf]
[DOI]

Instant Uncertainty Calibration of NeRFs Using a Meta-Calibrator
Niki Amini-Naieni*, Tomas Jakab, Andrea Vedaldi, Ronald Clark
[pdf]
[DOI]

MetaAT: Active Testing for Label-Efficient Evaluation of Dense Recognition Tasks
Sanbao Su, Xin Li*, Thang Doan, Sima Behpour, Wenbin He, Liang Gou, Fei Miao, Liu Ren
[pdf]
[DOI]

Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training
Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min*
[pdf]
[DOI]

Data Augmentation via Latent Diffusion for Saliency Prediction
Bahar Aydemir*, Deblina Bhattacharjee, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
[pdf]
[DOI]

Explorative Inbetweening of Time and Space
Haiwen Feng*, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Fernandez Abrevaya, Michael J. Black, Xuaner Zhang
[pdf]
[DOI]

A Diffusion Model for Simulation Ready Coronary Anatomy with Morpho-skeletal Control
Karim Kadry*, Shreya Gupta, Jonas Sogbadji, Michiel Schaap, Kersten Petersen, Takuya Mizukami, Carlos Collet, Farhad R. Nezami, Elazer R Edelman
[pdf]
[DOI]

Learning to Make Keypoints Sub-Pixel Accurate
Shinjeong Kim*, Marc Pollefeys, Daniel Barath
[pdf]
[DOI]

Imaging with Confidence: Uncertainty Quantification for High-dimensional Undersampled MR Images
Frederik Hoppe*, Claudio Mayrink Verdun, Hannah Sophie Laus, Sebastian Endt, Marion Irene Menzel, Felix Krahmer, Holger Rauhut
[pdf]
[DOI]

Generalizable Human Gaussians for Sparse View Synthesis
YoungJoong Kwon*, Baole Fang, Yixing Lu, Haoye Dong, Cheng Zhang, Francisco Vicente Carrasco, Albert Mosella-Montoro, Jianjin Xu, Shingo J Takagi, Daeil Kim, Aayush Prakash, Fernando de la Torre
[pdf]
[DOI]

DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model
Li Xiaofan*, Zhang Yifu*, Ye Xiaoqing*
[pdf]
[DOI]

Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off
Levente Halmosi, Bálint Mohos, Márk Jelasity*
[pdf]
[DOI]

SkyScenes: A Synthetic Dataset for Aerial Scene Understanding
Sahil S Khose*, Anisha Pal, Aayushi Agarwal, Deepanshi, Judy Hoffman, Prithvijit Chattopadhyay
[pdf]
[DOI]

Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps
Jordão Bragantini*, Merlin Lange, Loïc A Royer
[pdf]
[DOI]

GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction
Yuxuan Mu*, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofei Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng
[pdf]
[DOI]

AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation
Shengkun Tang*, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu
[pdf]
[DOI]

PFedEdit: Personalized Federated Learning via Automated Model Editing
Haolin Yuan*, William Paul, John Aucott, Philippe Burlina, Yinzhi Cao*
[pdf]
[DOI]

De-Confusing Pseudo-Labels in Source-Free Domain Adaptation
Idit Diamant*, Amir Rosenfeld, Idan Achituve, Jacob Goldberger, Arnon Netzer
[pdf]
[DOI]

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Ibrahim Ethem Hamamci*, Sezgin Er, Anjany Sekuboyina, Enis Simsar, Alperen Tezcan, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Furkan Almas, Irem Dogan, Muhammed Furkan Dasdelen, Chinmay Prabhakar, Hadrien Reynaud, Sarthak Pati, Christian Bluethgen, Mehmet Kemal Ozdemir, Bjoern Menze
[pdf]
[DOI]

EraseDraw: Learning to Insert Objects by Erasing Them from Images
Alper Canberk*, Maksym Bondarenko, Ege Ozguroglu, Ruoshi Liu, Carl Vondrick
[pdf]
[DOI]

SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference
Alind Khare*, Animesh Agrawal, Aditya Annavajjala, Payman Behnam, Myungjin Lee, Hugo M Latapie, Alexey Tumanov
[pdf]
[DOI]

Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models
Francesco Croce*, Naman D. Singh, Matthias Hein*
[pdf]
[DOI]

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training
David Wan*, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal
[pdf]
[DOI]

Keypoint Promptable Re-Identification
Vladimir Somers*, Alexandre Alahi, Christophe De Vleeschouwer
[pdf]
[DOI]

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Fabio Quattrini*, Vittorio Pippi, Silvia Cascianelli*, Rita Cucchiara
[pdf]
[DOI]

DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting
Angelos Kratimenos*, Jiahui Lei, Kostas Daniilidis
[pdf]
[DOI]

Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos
Remy Sabathier*, David Novotny, Niloy Mitra
[pdf]
[DOI]

Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers’ Opinion Scores
Lucas Goncalves, Prashant Mathur*, Chandrashekhar Lavania, Metehan Cekic, Marcello Federico, Kyu Han
[pdf]
[DOI]

MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato, Pu Wang*, Peizhao Li, Adriano Cardace, Petros Boufounos
[pdf]
[DOI]

Training A Secure Model against Data-Free Model Extraction
Zhenyi Wang*, Li Shen*, Junfeng Guo, Tiehang Duan, Siyu Luan, Tongliang Liu, Mingchen Gao
[pdf]
[DOI]

EpipolarGAN: Omnidirectional Image Synthesis with Explicit Camera Control
Christopher May*, Daniel Aliaga
[pdf]
[DOI]

TriNeRFLet: A Wavelet Based Triplane NeRF Representation
Rajaei Khatib*, Raja Giryes*
[pdf]
[DOI]

EgoBody3M: Egocentric Body Tracking on a VR Headset using a Diverse Dataset
Amy Zhao, Chengcheng Tang, Lezi Wang, Yijing Li, Mihika Dave, Lingling Tao*, Christopher D. Twigg, Robert Y. Wang
[pdf]
[DOI]

Photorealistic Video Generation with Diffusion Models
Agrim Gupta*, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, Jose Lezama
[pdf]
[DOI]

RAVE: Residual Vector Embedding for CLIP-Guided Backlit Image Enhancement
Tatiana Gaintseva*, Martin Benning, Gregory Slabaugh*
[pdf]
[DOI]

TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
Aditya Chinchure*, Pushkar Shukla*, Gaurav Bhatt, Kiri Salij, Kartik Hosanagar, Leonid Sigal, Matthew Turk
[pdf]
[DOI]

Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi*, Takashi Shibata*, Makoto Terao*
[pdf]
[DOI]

DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation
Rakshith Subramanyam*, Kowshik Thopalli*, Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan
[pdf]
[DOI]

Ex2Eg-MAE: A Framework for Adaptation of Exocentric Video Masked Autoencoders for Egocentric Social Role Understanding
Minh Tran*, Yelin Kim, Che-Chun Su, Min Sun, Cheng-Hao Kuo, Mohammad Soleymani
[pdf]
[DOI]

Self-Supervised Audio-Visual Soundscape Stylization
Tingle Li*, Renhao Wang, Po-Yao Huang, Andrew Owens, Gopala Krishna Anumanchipalli
[pdf]
[DOI]

SAVE: Protagonist Diversification with Structure Agnostic Video Editing
Yeji Song*, Wonsik Shin, Junsoo Lee, Jeesoo Kim, Nojun Kwak*
[pdf]
[DOI]

VideoAgent: Long-form Video Understanding with Large Language Model as Agent
Xiaohan Wang*, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy
[pdf]
[DOI]

Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning
Thong Thanh Nguyen*, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi M Le, Cong-Duy Nguyen, See Kiong Ng, Anh Tuan Luu
[pdf]
[DOI]

Source-Free Domain-Invariant Performance Prediction
Ekaterina Khramtsova*, Mahsa Baktashmotlagh, Guido Zuccon, Xi Wang, Mathieu Salzmann
[pdf]
[DOI]

Improving Robustness to Model Inversion Attacks via Sparse Coding Architectures
Sayanton V. Dibbo*, Adam Breuer, Juston Moore, Michael Teti
[pdf]
[DOI]

Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort
Jeeyung Kim*, Ze Wang, Qiang Qiu
[pdf]
[DOI]

Direct Distillation between Different Domains
Jialiang Tang, Shuo Chen*, Gang Niu, Hongyuan Zhu, Joey Tianyi Zhou, Chen Gong*, Masashi Sugiyama
[pdf]
[DOI]

Contrastive ground-level image and remote sensing pre-training improves representation learning for natural world imagery
Andy V Huynh*, Lauren Gillespie, Jael Lopez-Saucedo, Claire Tang, Rohan Sikand, Moisés Expósito-Alonso
[pdf]
[DOI]

V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation
Pooja Guhan*, Tsung-Wei Huang, Guan-Ming Su, Subhadra Gopalakrishnan, Dinesh Manocha
[pdf]
[DOI]

GRiT: A Generative Region-to-text Transformer for Object Understanding
Jialian Wu*, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang
[pdf]
[DOI]

LRSLAM: Low-rank Representation of Signed Distance Fields in Dense Visual SLAM System
Hongbeen Park, Minjeong Park, Giljoo Nam, Jinkyu Kim*
[pdf]
[DOI]

Learning Representation for Multitask Learning through Self-Supervised Auxiliary Learning
Seokwon Shin, Hyungrok Do, Youngdoo Son*
[pdf]
[DOI]

Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending
Delong Wu, Hao Zhu, Qi Zhang, You Li, Xun Cao*, Zhan Ma*
[pdf]
[DOI]

Geometry Fidelity for Spherical Images
Anders Christensen*, Nooshin Mojab*, Khushman Patel, Karan Ahuja, Zeynep Akata, Ole Winther, Mar Gonzalez Franco, Andrea Colaco
[pdf]
[DOI]

BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling
Cheng Peng*, Yutao Tang, Yifan Zhou, Nengyu Wang, Xijun Liu, Deming Li, Rama Chellappa
[pdf]
[DOI]

CroMo-Mixup: Augmenting Cross-Model Representations for Continual Self-Supervised Learning
Erum Mushtaq*, Duygu Nur Yaldiz, Yavuz Faruk Bakman, Jie Ding, Chenyang Tao, Dimitrios Dimitriadis, Salman Avestimehr
[pdf]
[DOI]

WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
Jiachen Lu, Ze Huang, Zeyu Yang, Zhang Jiahui, Li Zhang*
[pdf]
[DOI]

Benchmarking Spurious Bias in Few-Shot Image Classifiers
Guangtao Zheng*, Wenqian Ye, Aidong Zhang
[pdf]
[DOI]

TurboEdit: Real-time text-based disentangled real image editing
Zongze Wu*, Nicholas I Kolkin, Jonathan Brandt, Richard Zhang, Eli Shechtman
[pdf]
[DOI]

Soft Shadow Diffusion (SSD): Physics-inspired Learning for 3D Computational Periscopy
Fadlullah A Raji*, John Murray-Bruce*
[pdf]
[DOI]

Augmented Neural Fine-tuning for Efficient Backdoor Purification
Nazmul Karim*, Abdullah Al Arafat, Umar Khalid, Zhishan Guo, Nazanin Rahnavard
[pdf]
[DOI]

REDIR: Refocus-free Event-based De-occlusion Image Reconstruction
Qi Guo, Hailong Shi*, Huan Li, Jinsheng Xiao, Xingyu Gao*
[pdf]
[DOI]

Free-Editor: Zero-shot Text-driven 3D Scene Editing
Nazmul Karim*, Hasan Iqbal, Umar Khalid, Chen Chen, Jing Hua
[pdf]
[DOI]

DPA-Net: Structured 3D Abstraction from Sparse Views via Differentiable Primitive Assembly
Fenggen Yu*, Yiming Qian, Xu Zhang, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang
[pdf]
[DOI]

An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li*
[pdf]
[DOI]

Few-shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt
Chenxi Liu*, Zhenyi Wang, Tianyi Xiong, Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang*
[pdf]
[DOI]

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang*
[pdf]
[DOI]

Generalizable Symbolic Optimizer Learning
Xiaotian Song, Peng Zeng, Yanan Sun*, Andy Song
[pdf]
[DOI]

Online Continuous Generalized Category Discovery
Keon-Hee Park, Hakyung Lee, Kyungwoo Song*, Gyeong-Moon Park*
[pdf]
[DOI]

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao*, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong*
[pdf]
[DOI]

Tackling Structural Hallucination in Image Translation with Local Diffusion
Seunghoi Kim*, Chen Jin, Tom Diethe, Matteo Figini, Henry FJ Tregidgo, Asher Mullokandov, Philip A Teare, Daniel Alexander
[pdf]
[DOI]

Hierarchical Separable Video Transformer for Snapshot Compressive Imaging
Ping Wang*, Yulun Zhang, Lishun Wang, Xin Yuan*
[pdf]
[DOI]

Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo*, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu
[pdf]
[DOI]

On the Vulnerability of Skip Connections to Model Inversion Attacks
Jun Hao Koh*, Sy-Tuyen Ho, Ngoc-Bao Nguyen, Ngai-Man Cheung
[pdf]
[DOI]

Adversarial Robustification via Text-to-Image Diffusion Models
Daewon Choi, Jongheon Jeong, Huiwon Jang, Jinwoo Shin*
[pdf]
[DOI]

Overcome Modal Bias in Multi-modal Federated Learning via Balanced Modality Selection
Yunfeng FAN*, Wenchao Xu*, Haozhao Wang, Fushuo Huo, Jinyu Chen, Song Guo
[pdf]
[DOI]

Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector
Xianren Zhang, Dongwon Lee, Suhang Wang*
[pdf]
[DOI]

Reinforcement Learning via Auxiliary Task Distillation
Abhinav N Harish*, Larry Heck, Josiah P Hanna, Zsolt Kira, Andrew Szot
[pdf]
[DOI]

DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation
Sanghyun Jo, Fei Pan, In-Jae Yu, Kyungsu Kim*
[pdf]
[DOI]

Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo*, Bohan Zhou, Zongqing Lu*
[pdf]
[DOI]

View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
Haodi He, Colton Stearns, Adam Harley, Leonidas Guibas*
[pdf]
[DOI]

Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception
Tianyou Luo*, Quan Yuan*, Yuchen Xia, Guiyang Luo, Yujia Yang, Jinglin Li
[pdf]
[DOI]

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models
Yuchen Yang*, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao*, Shao-Yuan Lo*
[pdf]
[DOI]

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen, Wei-Hua Li, Cheng Sun, Yu-Chiang Frank Wang, Chu-Song Chen*
[pdf]
[DOI]

TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias
Sanghyun Jo, Soohyun Ryu, Sungyub Kim, Eunho Yang, Kyungsu Kim*
[pdf]
[DOI]

Learning Quantized Adaptive Conditions for Diffusion Models
Yuchen Liang*, Yuchuan Tian, Lei Yu, Huaao Tang, Jie Hu, Xiangzhong Fang, Hanting Chen*
[pdf]
[DOI]

STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay
Yu Yongcan, Lijun Sheng, Ran He, Jian Liang*
[pdf]
[DOI]

Remove Projective LiDAR Depthmap Artifacts via Exploiting Epipolar Geometry
Shengjie Zhu*, Girish Chandar Ganesan, Abhinav Kumar, Xiaoming Liu
[pdf]
[DOI]

Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic*
[pdf]
[DOI]

High-Fidelity Modeling of Generalizable Wrinkle Deformation
Jingfan Guo, Jae Shin Yoon, Shunsuke Saito, Takaaki Shiratori, Hyun Soo Park*
[pdf]
[DOI]

Instruction Tuning-free Visual Token Complement for Multimodal LLMs
Dongsheng Wang*, Jiequan Cui, Miaoge Li, Wang Lin, Bo Chen, Hanwang Zhang
[pdf]
[DOI]

Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
Ting Lei, Shaofeng Yin, Yuxin Peng, Yang Liu*
[pdf]
[DOI]

Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng, Xinhao Cai, Qingchao Chen, Yuxin Peng, Yang Liu*
[pdf]
[DOI]

Revisit Self-supervision with Local Structure-from-Motion
Shengjie Zhu*, Xiaoming Liu
[pdf]
[DOI]

FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis
Vishnu Mani Hema*, Shubhra Aich, Christian Haene, Jean-Charles Bazin, Fernando de la Torre
[pdf]
[DOI]

Efficient Learning of Event-based Dense Representation using Hierarchical Memories with Adaptive Update
Uday Kamal*, Saibal Mukhopadhyay
[pdf]
[DOI]

SNP: Structured Neuron-level Pruning to Preserve Attention Scores
KyungHwan Shim, Jaewoong Yun, Shinkook Choi*
[pdf]
[DOI]

Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation
Lei Wang, Zejian Yuan, Badong Chen*
[pdf]
[DOI]

Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats
Mingyang Xie*, Haoming Cai, Sachin Shah, Yiran Xu, Brandon Y. Feng, Jia-Bin Huang, Christopher A. Metzler
[pdf]
[DOI]

PALM: Predicting Actions through Language Models
Sanghwan Kim*, Daoji Huang, Yongqin Xian, Otmar Hilliges, Luc Van Gool, Xi Wang
[pdf]
[DOI]

Motion Keyframe Interpolation for Any Human Skeleton using Point Cloud-based Human Motion Data Homogenisation
Clinton A Mo, Kun Hu*, Chengjiang Long, Dong Yuan, Zhiyong Wang
[pdf]
[DOI]

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
Trung Tuan Dao*, Thuan Hoang Nguyen, Thanh Van Le, Duc H Vu, Khoi Nguyen, Cuong Pham, Anh T Tran*
[pdf]
[DOI]

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
Yuxiao Chen*, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas*
[pdf]
[DOI]

Improving Hyperbolic Representations via Gromov-Wasserstein Regularization
Yifei Yang, Wonjun Lee, Dongmian Zou*, Gilad Lerman
[pdf]
[DOI]

VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG
Yankun Xu*, Junzhe Wang, Yun-Hsuan Chen, Jie Yang, Wenjie Ming, Shuang Wang, Mohamad Sawan*
[pdf]
[DOI]

DiffSurf: A Transformer-based Diffusion Model for Generating and Reconstructing 3D Surfaces in Pose
Yusuke Yoshiyasu*, Leyuan Sun
[pdf]
[DOI]

Exploiting Supervised Poison Vulnerability to Strengthen Self-Supervised Defense
Jeremy Styborski*, Mingzhi Lyu*, Yi Huang*, Adams Kong*
[pdf]
[DOI]

Dense Hand-Object (HO) GraspNet with Full Grasping Taxonomy and Dynamics
Woojin Cho, Jihyun Lee, Minjae Yi, Minje Kim, Taeyun Woo, Donghwan Kim, Taewook Ha, Hyokeun Lee, Je-Hwan Ryu, Woontack Woo, Tae-Kyun (T-K) Kim*
[pdf]
[DOI]

Human Pose Recognition via Occlusion-Preserving Abstract Images
Saad Manzur*, Wayne B Hayes*
[pdf]
[DOI]

DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception
Kai Jiang*, Jiaxing Huang, Weiying Xie, Jie Lei, Yunsong Li, Ling Shao, Shijian Lu
[pdf]
[DOI]

SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow
Yuanzhi Zhu*, Xingchao Liu, Qiang Liu*
[pdf]
[DOI]

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation
Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, Shenlong Wang*
[pdf]
[DOI]

Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
Chao Wang*, Zhedong Zheng, Ruijie Quan, Yi Yang
[pdf]
[DOI]

DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation
Jeongsol Kim, Geon Yeong Park, Jong Chul Ye*
[pdf]
[DOI]

Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation
Zhilin Zhu*, Xiaopeng Hong*, Zhiheng Ma, Weijun Zhuang, YaoHui Ma, Yong Dai, Yaowei Wang
[pdf]
[DOI]

Personalized Privacy Protection Mask Against Unauthorized Facial Recognition
Ka-Ho Chow*, Sihao Hu, Tiansheng Huang, Ling Liu
[pdf]
[DOI]

PosterLlama: Bridging Design Ability of Language Model to Content-Aware Layout Generation
Jaejung Seol, SeoJun Kim, Jaejun Yoo*
[pdf]
[DOI]

PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control
Rishubh Parihar*, Sachidanand VS, Sabariswaran Mani, Tejan Karmali, Venkatesh Babu RADHAKRISHNAN
[pdf]
[DOI]

LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation
Pengwei Yin*, Jingjing Wang, Guanzhong Zeng, Di Xie, Jiang Zhu
[pdf]
[DOI]

Efficient Training with Denoised Neural Weights
Yifan Gong*, Zheng Zhan, Yanyu Li, Yerlan Idelbayev, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren
[pdf]
[DOI]

Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng*, Bryan Hooi*
[pdf]
[DOI]

Integration of Global and Local Representations for Fine-grained Cross-modal Alignment
Seungwan Jin, Hoyoung Choi, Taehyung Noh, Kyungsik Han*
[pdf]
[DOI]

Local and Global Flatness for Federated Domain Generalization
Hao Yan, Yuhong Guo*
[pdf]
[DOI]

SRPose: Two-view Relative Pose Estimation with Sparse Keypoints
Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia*
[pdf]
[DOI]

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
Xiaoshi Wu, Yiming Hao, Manyuan Zhang*, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu, Hongsheng Li*
[pdf]
[DOI]

Paying More Attention to Images: A Training-Free Method for Alleviating Hallucination in LVLMs
Shi Liu*, Kecheng Zheng*, Wei Chen*
[pdf]
[DOI]

Inf-DiT: Upsampling any-resolution image with memory-efficient diffusion transformer
Zhuoyi Yang*, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang
[pdf]
[DOI]

Implicit Neural Models to Extract Heart Rate from Video
Pradyumna Chari*, Anirudh Bindiganavale Harish, Adnan Armouti, Alexander Vilesov, Sanjit Sarda, Laleh Jalilian, Achuta Kadambi
[pdf]
[DOI]

Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering
Francesco Di Sario*, Riccardo Renzulli, Marco Grangetto, Enzo Tartaglione
[pdf]
[DOI]

PFGS: High Fidelity Point Cloud Rendering via Feature Splatting
Jiaxu Wang, Zhang Ziyi, Junhao He, Renjing Xu*
[pdf]
[DOI]

Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation
Guan Gui, Bin-Bin Gao*, Jun Liu, Chengjie Wang, Yunsheng Wu
[pdf]
[DOI]

E3M: Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation
Peijun Bao*, Zihao Shao, Wenhan Yang, Boon Poh Ng, Alex Kot
[pdf]
[DOI]

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Linrui Tian*, Qi Wang*, Bang Zhang*, Liefeng Bo*
[pdf]
[DOI]

LMT-GP: Combined Latent Mean-Teacher and Gaussian Process for Semi-supervised Low-light Image Enhancement
Ye Yu, Fengxin Chen, Jun Yu*, Zhen Kan
[pdf]
[DOI]

Veil Privacy on Visual Data: Concealing Privacy for Humans, Unveiling for DNNs
Shuchao Pang*, Ruhao Ma, Bing Li*, Yongbin Zhou, Yazhou Yao
[pdf]
[DOI]

Efficient Vision Transformers with Partial Attention
Xuan-Thuy Vo*, Duy-Linh Nguyen, Adri Priadana, Kang-Hyun Jo*
[pdf]
[DOI]

Generalized Coverage for More Robust Low-Budget Active Learning
Wonho Bae, Junhyug Noh, Danica J. Sutherland*
[pdf]
[DOI]

Rasterized Edge Gradients: Handling Discontinuities Differentially
Stanislav Pidhorskyi*, Tomas Simon, Gabriel Schwartz, He Wen, Yaser Sheikh, Jason Saragih
[pdf]
[DOI]

Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment
Chong Li*, Xuelin Qian, Yun Wang, Jingyang Huo, Xiangyang Xue*, Yanwei Fu*, Jianfeng Feng
[pdf]
[DOI]

FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning
Boyu Fan*, Chenrui Wu, Xiang Su, Pan HUI
[pdf]
[DOI]

LLaVA-UHD: an LMM Perceiving any Aspect Ratio and High-Resolution Images
Zonghao Guo, Ruyi Xu, Yuan Yao*, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Gao Huang*
[pdf]
[DOI]

Learning Natural Consistency Representation for Face Forgery Video Detection
Daichi Zhang*, Zihao Xiao, Shikun Li, Fanzhao Lin, Jianmin Li, Shiming Ge*
[pdf]
[DOI]

ZeroI2V: Zero-Cost Adaptation of Pre-Trained Transformers from Image to Video
Xinhao Li, Yuhan Zhu, Limin Wang*
[pdf]
[DOI]

Zero-Shot Adaptation for Approximate Posterior Sampling of Diffusion Models in Inverse Problems
Yasar U Alcalar*, Mehmet Akcakaya
[pdf]
[DOI]

R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
Changhoon Kim*, Kyle Min*, Yezhou Yang
[pdf]
[DOI]

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection
Hu Zhang, Xu Jianhua, Tao Tang, Haiyang Sun, Xin Yu*, Zi Helen Huang*, Kaicheng Yu
[pdf]
[DOI]

Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion
Yu Cao*, Shaogang Gong
[pdf]
[DOI]

Data Poisoning Quantization Backdoor Attack
Tran Huynh*, Anh Tran, Khoa Doan, Tung Pham
[pdf]
[DOI]

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang, Zhou Xu, Yuming Lin, Jingtao Ye, Hongsheng Li, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang*
[pdf]
[DOI]

On the Topology Awareness and Generalization Performance of Graph Neural Networks
Junwei Su*, Chuan Wu
[pdf]
[DOI]

T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy
Fan Duan, Jiahao Yu, Li Chen*
[pdf]
[DOI]

A high-quality robust diffusion framework for corrupted dataset
Quan Dao*, Binh Ta, Tung Pham, Anh Tran
[pdf]
[DOI]

Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning
Amandeep Kumar*, Muhammad Awais, Sanath Narayan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer
[pdf]
[DOI]

Distilling Knowledge from Large-Scale Image Models for Object Detection
Gang Li*, Wenhai Wang, Xiang Li, Ziheng Li, Jian Yang, Jifeng Dai, Yu Qiao, Shanshan Zhang*
[pdf]
[DOI]

Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection
Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen*, Alois C. Knoll
[pdf]
[DOI]

TimeLens-XL: Real-time Event-based Video Frame Interpolation with Large Motion
Shi Guo, Yutian Chen, Tianfan Xue, Jinwei Gu, Yongrui Ma*
[pdf]
[DOI]

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection
Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer*
[pdf]
[DOI]

Self-Supervised Underwater Caustics Removal and Descattering via Deep Monocular SLAM
Jonathan Sauder*, Devis Tuia
[pdf]
[DOI]

Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets
Qin Lei*, Jiang Zhong, Qizhu Dai
[pdf]
[DOI]

Retrieval Robust to Object Motion Blur
Rong Zou, Marc Pollefeys, Denys Rozumnyi*
[pdf]
[DOI]

Unsupervised Representation Learning by Balanced Self Attention Matching
Daniel Shalam*, Simon Korman*
[pdf]
[DOI]

DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences
Peidong Li*, Wancheng Shen, Qihao Huang, Dixiao Cui*
[pdf]
[DOI]

Identity-Consistent Diffusion Network for Grading Knee Osteoarthritis Progression in Radiographic Imaging
Wenhua Wu, Kun Hu*, Wenxi Yue, Wei Li, Milena Simic, Changyang Li, Wei Xiang, Zhiyong Wang
[pdf]
[DOI]

Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction
Misha Andriluka*, Baruch Tabanpour, Daniel Freeman, Cristian Sminchisescu
[pdf]
[DOI]

Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation
Ilhoon Yoon, Hyeongjun Kwon, Jin Kim, Junyoung Park, Hyunsung Jang, Kwanghoon Sohn*
[pdf]
[DOI]

Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation
Shentong Mo, Enze Xie*, Yue Wu, Junsong Chen, Matthias Niessner, Zhenguo Li
[pdf]
[DOI]

Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation
Shoumeng Qiu, Jie Chen, Xinrun Li, Ru Wan, Xiangyang Xue, Jian Pu*
[pdf]
[DOI]

Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation
Fangfu Liu, Hanyang Wang, Weiliang Chen, Haowen Sun, Yueqi Duan*
[pdf]
[DOI]

Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts
Jianhao Li, Tianyu Sun, Zhongdao Wang*, Enze Xie, Bailan Feng, Hongbo Zhang, Ze Yuan, Ke Xu, Jiaheng Liu*, Ping Luo
[pdf]
[DOI]

SCOD: From Heuristics to Theory
Vojtech Franc*, Jakub Paplham*, Daniel Prusa*
[pdf]
[DOI]

Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection
Gaurav Bhatt*, Leonid Sigal, James Ross
[pdf]
[DOI]

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta*, Alberto Baldrati, Marco Bertini, Andrew D. Bagdanov
[pdf]
[DOI]

Teach CLIP to Develop a Number Sense for Ordinal Regression
Yao DU*, Qiang Zhai, Weihang Dai, Xiaomeng Li*
[pdf]
[DOI]

Compact 3D Scene Representation via Self-Organizing Gaussian Grids
Wieland Morgenstern*, Florian Barthel, Anna Hilsmann, Peter Eisert
[pdf]
[DOI]

Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala*, Jianfeng Gao, Jianwei Yang
[pdf]
[DOI]

VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking
Jens Hellekes*, Manuel Mühlhaus, Reza Bahmanyar, Seyed Majid Azimi, Franz Kurz
[pdf]
[DOI]

SelfGeo: Self-supervised and Geodesic-consistent Estimation of Keypoints on Deformable Shapes
Mohammad Zohaib*, Luca Cosmo, Alessio Del Bue
[pdf]
[DOI]

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning
Xinyuan Gao, Songlin Dong, Yuhang He*, Qiang Wang, Yihong Gong
[pdf]
[DOI]

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
Zhongqi Wang, Jie Zhang*, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

ExMatch: Self-guided Exploitation for Semi-Supervised Learning with Scarce Labeled Samples
Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee*
[pdf]
[DOI]

Towards Certifiably Robust Face Recognition
Seunghun Paik, Dongsoo Kim, Chanwoo Hwang, Sunpill Kim, Jae Hong Seo*
[pdf]
[DOI]

Linking in Style: Understanding learned features in deep learning models
Maren Wehrheim*, Pamela Osuna Vargas, Matthias Kaschube
[pdf]
[DOI]

Stable Video Portraits
Mirela Ostrek*, Justus Thies
[pdf]
[DOI]

UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework
Tarun Kalluri*, Sreyas Ravichandran, Manmohan Chandraker
[pdf]
[DOI]

CliffPhys: Camera-based Respiratory Measurement using Clifford Neural Networks
Omar Ghezzi*, Giuseppe Boccignone, Giuliano Grossi, Raffaella Lanzarotti, Alessandro D'Amelio
[pdf]
[DOI]

Learned Rate Control for Frame-Level Adaptive Neural Video Compression via Dynamic Neural Network
Chenhao Zhang, Wei Gao*
[pdf]
[DOI]

PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers
Ananthu Aniraj*, Cassio F. Dantas, Dino Ienco, Diego Marcos
[pdf]
[DOI]

Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection
Zihan Zhang, Zhuo Xu, Xiang Xiang*
[pdf]
[DOI]

Synthesizing Environment-Specific People in Photographs
Mirela Ostrek*, Carol O'Sullivan, Michael J. Black, Justus Thies
[pdf]
[DOI]

Weight Conditioning for Smooth Optimization of Neural Networks
Hemanth Saratchandran*, Thomas X Wang, Simon Lucey
[pdf]
[DOI]

Energy-Calibrated VAE with Test Time Free Lunch
Yihong Luo, Siya Qiu, Xingjian Tao, Yujun Cai, Jing Tang*
[pdf]
[DOI]

MoEAD: A Parameter-efficient Model for Multi-class Anomaly Detection
Shiyuan Meng, Wenchao Meng*, Qihang Zhou, Shizhong Li, Weiye Hou, Shibo He
[pdf]
[DOI]

SceneTeller: Language-to-3D Scene Generation
Basak Melis Ocal*, Maxim Tatarchenko, Sezer Karaoglu, Theo Gevers
[pdf]
[DOI]

MagMax: Leveraging Model Merging for Seamless Continual Learning
Daniel Marczak*, Bartlomiej Twardowski*, Tomasz Trzcinski*, Sebastian Cygert*
[pdf]
[DOI]

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
Yi Wang*, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, SongZe Li, Hongjie Zhang, Yifei Huang, Yu Qiao*, Yali Wang*, Limin Wang*
[pdf]
[DOI]

DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Konstantina Nikolaidou*, George Retsinas, Giorgos Sfikas, Marcus Liwicki
[pdf]
[DOI]

Debiasing surgeon: fantastic weights and how to find them
Remi Nahon, Ivan Luiz De Moura Matos, Van-Tam Nguyen, Enzo Tartaglione*
[pdf]
[DOI]

Denoising Vision Transformers
Jiawei Yang*, Katie Z Luo, Jiefeng Li, Congyue Deng, Leonidas Guibas, Dilip Krishnan, Kilian Weinberger, Yonglong Tian, Yue Wang
[pdf]
[DOI]

Differentiable Product Quantization for Memory Efficient Camera Relocalization
Zakaria Laskar*, Iaroslav Melekhov, Assia Benbihi, Shuzhe Wang, Juho Kannala
[pdf]
[DOI]

Spline-based Transformers
Prashanth Chandran*, Agon Serifi*, Markus Gross, Moritz Bächer
[pdf]
[DOI]

Learning Pseudo 3D Guidance for View-consistent Texturing with 2D Diffusion
Kehan Li, Yanbo Fan*, Yang Wu, Zhongqian Sun, Wei Yang, Xiangyang Ji, Li Yuan, Jie Chen*
[pdf]
[DOI]

TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly
Mengqi Guo*, Chen Li, Yuyang Zhao, Gim Hee Lee
[pdf]
[DOI]

SparseRadNet: Sparse Perception Neural Network on Subsampled Radar Data
Jialong Wu*, Mirko Meuter, Markus Schoeler, Matthias Rottmann
[pdf]
[DOI]

Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
Yang Zhang*, Tze Tzun Teoh, Wei Hern Lim, Kenji Kawaguchi
[pdf]
[DOI]

Adversarial Diffusion Distillation
Axel Sauer*, Dominik Lorenz, Andreas Blattmann, Robin Rombach
[pdf]
[DOI]

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection
Yuzhen Lin*, Wentang Song, Bin Li*, Yuezun Li, Jiangqun Ni, Han Chen, Qiushi Li
[pdf]
[DOI]

Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
Andong Tan, Fengtao Zhou, Hao Chen*
[pdf]
[DOI]

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
Tong Shao, Zhuotao Tian*, Hang Zhao, Jingyong Su*
[pdf]
[DOI]

A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
Xiang Liu, Zhaoxiang Liu*, Huan Hu, Zezhou Chen, Kohou Wang, Kai Wang, Shiguo Lian*
[pdf]
[DOI]

Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models
Taesup Kim*, Donggeun Kim
[pdf]
[DOI]

Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information
Luca Di Giammarino*, Boyang Sun, Giorgio Grisetti, Marc Pollefeys, Hermann Blum, Daniel Barath
[pdf]
[DOI]

Improving Diffusion Models for Authentic Virtual Try-on in the Wild
Yisol Choi*, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin*
[pdf]
[DOI]

Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
Minchan Kim, Minyeong Kim, Junik Bae, Suhwan Choi, Sungkyung Kim, Buru Chang*
[pdf]
[DOI]

LISO: Lidar-only Self-Supervised 3D Object Detection
Stefan Andreas Baur*, Frank Moosmann, Andreas Geiger
[pdf]
[DOI]

Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar*, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari
[pdf]
[DOI]

Implicit Steganography Beyond the Constraints of Modality
Sojeong Song*, Seoyun Yang*, Chang D. Yoo*, Junmo Kim*
[pdf]
[DOI]

Using My Artistic Style? You Must Obtain My Authorization
Xiuli Bi, Haowei Liu, Weisheng Li, Bo Liu*, Bin Xiao
[pdf]
[DOI]

LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul*, Volker Tresp, Prateek Jain
[pdf]
[DOI]

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation
Nina Weng*, Paraskevas Pegios, Eike Petersen, Aasa Feragen, Siavash Arjomand Bigdeli
[pdf]
[DOI]

UMERegRobust – Universal Manifold Embedding Compatible Features for Robust Point Cloud Registration
Yuval Haitman*, Amit Efraim, Joseph M Francos
[pdf]
[DOI]

Non-transferable Pruning
Ruyi Ding*, Lili Su, A. Adam Ding, Yunsi Fei
[pdf]
[DOI]

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis
Kai Katsumata*, Duc Minh Vo, Hideki Nakayama
[pdf]
[DOI]

Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations
Tomáš Chobola*, Yu Liu, Hanyi Zhang, Julia A Schnabel, Tingying Peng*
[pdf]
[DOI]

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning
Yan Li, Weiwei Guo*, Xue Yang, Ning Liao, Dunyun He, Jiaqi Zhou, Wenxian Yu*
[pdf]
[DOI]

Affine steerers for structured keypoint description
Georg Bökman*, Johan Edstedt, Michael Felsberg, Fredrik Kahl
[pdf]
[DOI]

Score Distillation Sampling with Learned Manifold Corrective
Thiemo Alldieck*, Nikos Kolotouros, Cristian Sminchisescu
[pdf]
[DOI]

FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
Xingtai Gui*, Tengteng Huang, Haonan Shao, Haotian Yao, Chi Zhang
[pdf]
[DOI]

Benchmarking the Robustness of Cross-view Geo-localization Models
Qingwang Zhang, Yingying Zhu*
[pdf]
[DOI]

GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth
Aurélien Cecille*, Stefan Duffner, Franck Davoine, Thibault Neveu, Rémi Agier
[pdf]
[DOI]

SUMix: Mixup with Semantic and Uncertain Information
Huafeng Qin, Xin Jin*, Hongyu Zhu, Hongchao Liao, Mounim A. El Yacoubi, Xinbo Gao
[pdf]
[DOI]

Flatness-aware Sequential Learning Generates Resilient Backdoors
Hoang Pham*, The-Anh Ta, Anh T Tran, Khoa D Doan
[pdf]
[DOI]

Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models
Xiao Liu, Xiaoliu Guan, Yu Wu*, Jiaxu Miao*
[pdf]
[DOI]

IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception
Shaohong Wang, Lu Bin, Xinyu Xiao, Zhiyu Xiang, Hangguan Shan, Eryun Liu*
[pdf]
[DOI]

DiffClass: Diffusion-Based Class Incremental Learning
Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao*, Yanzhi Wang*
[pdf]
[DOI]

Convex Relaxations for Manifold-Valued Markov Random Fields with Approximation Guarantees
Robin Kenis*, Emanuel Laude, Panagiotis Patrinos
[pdf]
[DOI]

Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros*, Thiemo Alldieck, Enric Corona, Eduard Gabriel Bazavan, Cristian Sminchisescu
[pdf]
[DOI]

PromptFusion: Decoupling Stability and Plasticity for Continual Learning
Haoran Chen, Zuxuan Wu*, Xintong Han, Menglin Jia, Yu-Gang Jiang
[pdf]
[DOI]

Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat*, Melissa Hall*, Alicia Yi Sun, Candace Ross, Michal Drozdzal, Adriana Romero-Soriano
[pdf]
[DOI]

Adapting to Shifting Correlations with Unlabeled Data Calibration
Minh Nguyen*, Alan Q Wang, Heejong Kim, Mert Sabuncu
[pdf]
[DOI]

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual, Chunghsin YEH*, Ioannis Tsiamas, Joan Serrà
[pdf]
[DOI]

Information Bottleneck Based Data Correction in Continual Learning
Shuai Chen, Mingyi Zhang, Junge Zhang*, Kaiqi Huang*
[pdf]
[DOI]

On Spectral Properties of Gradient-based Explanation Methods
Amir Mehrpanah*, Erik Englesson, Hossein Azizpour
[pdf]
[DOI]

Contextual Correspondence Matters: Bidirectional Graph Matching for Video Summarization
Yunzuo Zhang*, Yameng Liu
[pdf]
[DOI]

O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation
Muer Tie, Julong Wei, Zhengjun Wang, Ke Wu, Shanshuai Yuan, Kaizhao Zhang, Jie Jia, Jieru Zhao, Zhongxue Gan*, Wenchao Ding*
[pdf]
[DOI]

Dataset Distillation by Automatic Training Trajectories
Dai Liu*, Jindong Gu*, Hu Cao, Carsten Trinitis, Martin Schulz*
[pdf]
[DOI]

FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation
Jingyi Tang*, Gu Wang, Zeyu Chen, Shengquan Li, Xiu Li*, Xiangyang Ji
[pdf]
[DOI]

EMIE-MAP: Large-Scale Road Surface Reconstruction Based on Explicit Mesh and Implicit Encoding
Wenhua Wu, Qi Wang, Guangming Wang, Junping Wang, Tiankun Zhao, Yang Liu, Dongchao Gao, Zhe Liu*, Hesheng Wang*
[pdf]
[DOI]

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei*, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen
[pdf]
[DOI]

SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning
Mengxin Zheng*, Jiaqi Xue, Zihao Wang, Xun Chen, Qian Lou, Lei Jiang, Xiaofeng Wang
[pdf]
[DOI]

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation
Yingshan Chang*, Yasi Zhang, Zhiyuan Fang, Ying Nian Wu, Yonatan Bisk, Feng Gao
[pdf]
[DOI]

Bones Can't Be Triangles: Accurate and Efficient Vertebrae Keypoint Estimation through Collaborative Error Revision
Jinhee Kim, Taesung Kim, Jaegul Choo*
[pdf]
[DOI]

latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
Christopher Wewer*, Kevin Raj, Eddy Ilg, Bernt Schiele, Jan E. Lenssen*
[pdf]
[DOI]

HyperSpaceX: Radial and Angular Exploration of HyperSpherical Dimensions
Chiranjeev Chiranjeev, Muskan Dosi, Kartik Thakral, Mayank Vatsa*, Richa Singh
[pdf]
[DOI]

InstructGIE: Towards Generalizable Image Editing
Zichong Meng, Changdi Yang, Jun Liu, Hao Tang*, Pu Zhao*, Yanzhi Wang*
[pdf]
[DOI]

HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation
Wencan Cheng, Eunji Kim, Jong Hwan Ko*
[pdf]
[DOI]

Navigating Text-to-Image Generative Bias across Indic Languages
Surbhi Mittal*, Arnav Sudan, Mayank Vatsa*, Richa Singh, Tamar Glaser, Tal Hassner
[pdf]
[DOI]

Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning
Ray Zhang*, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Ryan M. Eustice, Maani Ghaffari Jadidi, Arnie Sen
[pdf]
[DOI]

CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models
Nick Stracke*, Stefan Andreas Baumann, Joshua Susskind, Miguel Angel Bautista, Bjorn Ommer
[pdf]
[DOI]

Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation
Sangyeop Yeo, Yoojin Jang, Jaejun Yoo*
[pdf]
[DOI]

VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation
Wenjie Zhuo*, Fan Ma, Hehe Fan, Yi Yang
[pdf]
[DOI]

A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
Riccardo Fogliato*, Pratik Patil, Mathew Monfort, Pietro Perona
[pdf]
[DOI]

Towards Scene Graph Anticipation
Rohith Peddi*, Saksham Singh, Saurabh, Parag Singla, Vibhav Gogate
[pdf]
[DOI]

Non-Line-of-Sight Estimation of Fast Human Motion with Slow Scanning Imagers
Javier Grau Chopite*, Patrick Hähn, Matthias B Hullin*
[pdf]
[DOI]

Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding
Danish Nazir*, Timo Bartels, Jan Piewek, Thorsten Bagdonat, Tim Fingscheidt
[pdf]
[DOI]

NePhi: Neural Deformation Fields for Approximately Diffeomorphic Medical Image Registration
Lin Tian*, Thomas H Greer, Raul San Jose Estepar, Roni Sengupta, Marc Niethammer
[pdf]
[DOI]

Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models
Rining Wu*, Feixiang Zhou, Ziwei Yin, Jian Liu*
[pdf]
[DOI]

Image Manipulation Detection With Implicit Neural Representation and Limited Supervision
Zhenfei Zhang*, Mingyang Li, Xin Li, Ming-Ching Chang, Jun-Wei Hsieh
[pdf]
[DOI]

Scalar Function Topology Divergence: Comparing Topology of 3D Objects
Ilya Trofimov*, Daria Voronkova, Eduard Tulchinskii, Evgeny Burnaev, Serguei Barannikov
[pdf]
[DOI]

Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks
Tingyu Qu*, Tinne Tuytelaars, Marie-Francine Moens
[pdf]
[DOI]

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
Vitali Petsiuk*, Kate Saenko
[pdf]
[DOI]

DeTra: A Unified Model for Object Detection and Trajectory Forecasting
Sergio Casas*, Ben T Agro, Jiageng Mao, Thomas Gilles, Alexander Y Cui, Enxu Li, Raquel Urtasun
[pdf]
[DOI]

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems
Denis Zavadski*, Johann-Friedrich Feiden, Carsten Rother
[pdf]
[DOI]

Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
Alexander Timans*, Christoph-Nikolas Straehle, Kaspar Sakmann, Eric Nalisnick
[pdf]
[DOI]

Common Sense Reasoning for Deep Fake Detection
Yue Zhang*, Ben Colman, Xiao Guo, Ali Shahriyari, Gaurav Bharaj*
[pdf]
[DOI]

Let the Avatar Talk using Texts without Paired Training Data
Xiuzhe Wu, Yang-Tian Sun, Handi Chen, Hang Zhou, Jingdong Wang, Zhengzhe Liu, Xiaojuan Qi*
[pdf]
[DOI]

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
Muhammad Zubair Irshad*, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus
[pdf]
[DOI]

GOEmbed: Gradient Origin Embeddings for Representation Agnostic 3D Feature Learning
Animesh Karnewar*, Roman Shapovalov, Tom Monnier, Andrea Vedaldi, Niloy J. Mitra*, David Novotny*
[pdf]
[DOI]

Causal Subgraphs and Information Bottlenecks: Redefining OOD Robustness in Graph Neural Networks
Weizhi An, Wenliang Zhong, Feng Jiang, Hehuan Ma, Junzhou Huang*
[pdf]
[DOI]

AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale
Keenon Werling*, Janelle M Kaneda, Tian Tan, Rishi Agarwal, Six Skov, Tom Van Wouwe, Scott Uhlrich, Scott Delp, Karen Liu, Nicholas A Bianco, Carmichael Ong, Antoine Falisse, Shardul Sapkota, Aidan Jai Chandra, Joshua A Carter, Ezio Preatoni, Benjamin J Fregly, Jennifer Hicks
[pdf]
[DOI]

How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi*, Xizhen Deng, Renhao Tan, Linfeng Ye, Ahmed Hussein Salamah
[pdf]
[DOI]

Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers
Ekaterina Grishina*, Mikhail Gorbunov, Maxim Rakhuba
[pdf]
[DOI]

Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models
Reza Abbasi, Mohammad Rohban, Mahdieh Soleymani Baghshah*
[pdf]
[DOI]

Modality Translation for Object Detection Adaptation without forgetting prior knowledge
Heitor Rapela Medeiros*, Masih Aminbeidokhti, Fidel A Guerrero Pena, David Latortue, Eric Granger, Marco Pedersoli
[pdf]
[DOI]

FroSSL: Frobenius Norm Minimization for Efficient Multiview Self-Supervised Learning
Oscar Skean*, Aayush Dhakal, Nathan Jacobs, Luis G Sanchez Giraldo
[pdf]
[DOI]

Learning Multimodal Latent Generative Models with Energy-Based Prior
Shiyu Yuan*, Jiali Cui, Hanao Li, Tian Han
[pdf]
[DOI]

On Learning Discriminative Features from Synthesized Data for Self-Supervised Fine-Grained Visual Recognition
Zihu Wang*, Lingqiao Liu, Scott Ricardo Figueroa Weston, Samuel Tian, Peng Li
[pdf]
[DOI]

LaWa: Using Latent Space for In-Generation Image Watermarking
Ahmad Rezaei*, Mohammad Akbari*, Saeed Ranjbar Alvar, Arezou Fatemi, Yong Zhang*
[pdf]
[DOI]

Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution
Mridul Khurana*, Arka Daw, M. Maruf, Josef C. Uyeda, Wasila Dahdul, Caleb Charpentier, Yasin Bakış, Henry L. Bart Jr., Paula M. Mabee, Hilmar Lapp, James P. Balhoff, Wei-Lun Chao, Charles Stewart, Tanya Berger-Wolf, Anuj Karpatne*
[pdf]
[DOI]

Markov Knowledge Distillation: Make Nasty Teachers trained by Self-undermining Knowledge Distillation Fully Distillable
En-hui Yang, Linfeng Ye*
[pdf]
[DOI]

Co-speech Gesture Video Generation with 3D Human Meshes
Aniruddha Mahapatra*, Richa Mishra*, Ziyi Chen, Boyang Ding, Renda Li, Shoulei Wang, Jun-Yan Zhu, Peng Chang, Mei Han, Jing Xiao
[pdf]
[DOI]

When and How do negative prompts take effect?
Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Boqing Gong, Cho-Jui Hsieh*
[pdf]
[DOI]

GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views
Yaniv Wolf*, Amit Bracha, Ron Kimmel
[pdf]
[DOI]

CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting
Jiezhi Yang*, Khushi P Desai*, Charles Packer*, Harshil Bhatia, Nicholas Rhinehart, Rowan McAllister, Joseph E Gonzalez*
[pdf]
[DOI]

Snuffy: Efficient Whole Slide Image Classifier
Hossein Jafarinia*, Alireza Alipanah, Saeed Razavi, Nahal Mirzaie, Mohammad Hossein Rohban*
[pdf]
[DOI]

Learning to Build by Building Your Own Instructions
Aaron T Walsman*, Muru Zhang, Adam Fishman, Ali Farhadi, Dieter Fox
[pdf]
[DOI]

Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling
Wonho Bae, Jing Wang, Danica J. Sutherland*
[pdf]
[DOI]

BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang*, Guandao Yang, Leonidas Guibas
[pdf]
[DOI]

DεpS: Delayed ε-Shrinking for Faster Once-For-All Training
Aditya Annavajjala*, Alind Khare*, Animesh Agrawal, Igor Fedorov, Hugo M Latapie, Myungjin Lee, Alexey Tumanov
[pdf]
[DOI]

Learning Depth from Focus in the Wild
Changyeon Won, Hae-Gon Jeon
[pdf]
[DOI]

Learning-Based Point Cloud Registration for 6D Object Pose Estimation in the Real World
Zheng Dang, Lizhou Wang, Yu Guo, Mathieu Salzmann
[pdf]
[DOI]

An End-to-End Transformer Model for Crowd Localization
Dingkang Liang, Wei Xu, Xiang Bai
[pdf]
[DOI]

Few-Shot Single-View 3D Reconstruction with Memory Prior Contrastive Network
Zhen Xing, Yijiang Chen, Zhixin Ling, Xiangdong Zhou, Yu Xiang
[pdf]
[DOI]

DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection
Liang Peng, Xiaopei Wu, Zheng Yang, Haifeng Liu, Deng Cai
[pdf]
[DOI]

Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation
Weisong Ren, Lijun Wang, Yongri Piao, Miao Zhang, Huchuan Lu, Ting Liu
[pdf]
[DOI]

Fusing Local Similarities for Retrieval-Based 3D Orientation Estimation of Unseen Objects
Chen Zhao, Yinlin Hu, Mathieu Salzmann
[pdf]
[DOI]

Lidar Point Cloud Guided Monocular 3D Object Detection
Liang Peng, Fei Liu, Zhengxu Yu, Senbo Yan, Dan Deng, Zheng Yang, Haifeng Liu, Deng Cai
[pdf]
[DOI]

Structural Causal 3D Reconstruction
Weiyang Liu, Zhen Liu, Liam Paull, Adrian Weller, Bernhard Schölkopf
[pdf]
[DOI]

3D Human Pose Estimation Using Möbius Graph Convolutional Networks
Niloofar Azizi, Horst Possegger, Emanuele Rodolà, Horst Bischof
[pdf]
[DOI]

Learning to Train a Point Cloud Reconstruction Network without Matching
Tianxin Huang, Xuemeng Yang, Jiangning Zhang, Jinhao Cui, Hao Zou, Jun Chen, Xiangrui Zhao, Yong Liu
[pdf]
[DOI]

PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation
Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, Yao Zhao
[pdf]
[DOI]

Self-supervised Human Mesh Recovery with Cross-Representation Alignment
Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu
[pdf]
[DOI]

AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction
Zerui Chen, Yana Hasson, Cordelia Schmid, Ivan Laptev
[pdf]
[DOI]

A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation
Yiming Qian, James H. Elder
[pdf]
[DOI]

PS-NeRF: Neural Inverse Rendering for Multi-View Photometric Stereo
Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
[pdf]
[DOI]

Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency
Tom Monnier, Matthew Fisher, Alexei A. Efros, Mathieu Aubry
[pdf]
[DOI]

Towards Comprehensive Representation Enhancement in Semantics-Guided Self-Supervised Monocular Depth Estimation
Jingyuan Ma, Xiangyu Lei, Nan Liu, Xian Zhao, Shiliang Pu
[pdf]
[DOI]

AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture
Zhe Li, Zerong Zheng, Hongwen Zhang, Chaonan Ji, Yebin Liu
[pdf]
[DOI]

Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers
Junhyeong Cho, Kim Youwang, Tae-Hyun Oh
[pdf]
[DOI]

GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping
Pan Ji, Qingan Yan, Yuxin Ma, Yi Xu
[pdf]
[DOI]

Multi-modal Masked Pre-training for Monocular Panoramic Depth Completion
Zhiqiang Yan, Xiang Li, Kun Wang, Zhenyu Zhang, Jun Li, Jian Yang
[pdf]
[DOI]

GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation
Shi Gong, Xiaoqing Ye, Xiao Tan, Jingdong Wang, Errui Ding, Yu Zhou, Xiang Bai
[pdf]
[DOI]

Learning Visibility for Robust Dense Human Body Estimation
Chun-Han Yao, Jimei Yang, Duygu Ceylan, Yi Zhou, Yang Zhou, Ming-Hsuan Yang
[pdf]
[DOI]

Towards High-Fidelity Single-View Holistic Reconstruction of Indoor Scenes
Haolin Liu, Yujian Zheng, Guanying Chen, Shuguang Cui, Xiaoguang Han
[pdf]
[DOI]

CompNVS: Novel View Synthesis with Scene Completion
Zuoyue Li, Tianxing Fan, Zhenqiang Li, Zhaopeng Cui, Yoichi Sato, Marc Pollefeys, Martin R. Oswald
[pdf]
[DOI]

SketchSampler: Sketch-Based 3D Reconstruction via View-Dependent Depth Sampling
Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu
[pdf]
[DOI]

LocalBins: Improving Depth Estimation by Learning Local Distributions
Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
[pdf]
[DOI]

2D GANs Meet Unsupervised Single-View 3D Reconstruction
Feng Liu, Xiaoming Liu
[pdf]
[DOI]

InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images
Zhengqi Li, Qianqian Wang, Noah Snavely, Angjoo Kanazawa
[pdf]
[DOI]

Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Bilateral Normal Integration
Xu Cao, Hiroaki Santo, Boxin Shi, Fumio Okura, Yasuyuki Matsushita
[pdf]
[DOI]

S2Contact: Graph-Based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning
Tze Ho Elden Tse, Zhongqun Zhang, Kwang In Kim, Aleš Leonardis, Feng Zheng, Hyung Jin Chang
[pdf]
[DOI]

SC-wLS: Towards Interpretable Feed-Forward Camera Re-localization
Xin Wu, Hao Zhao, Shunkai Li, Yingdian Cao, Hongbin Zha
[pdf]
[DOI]

FloatingFusion: Depth from ToF and Image-Stabilized Stereo Cameras
Andreas Meuleman, Hakyeong Kim, James Tompkin, Min H. Kim
[pdf]
[DOI]

DELTAR: Depth Estimation from a Light-Weight ToF Sensor and RGB Image
Yijin Li, Xinyang Liu, Wenqi Dong, Han Zhou, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui
[pdf]
[DOI]

3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform
Yining Zhao, Chao Wen, Zhou Xue, Yue Gao
[pdf]
[DOI]

RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation
Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji
[pdf]
[DOI]

Monocular 3D Object Reconstruction with GAN Inversion
Junzhe Zhang, Daxuan Ren, Zhongang Cai, Chai Kiat Yeo, Bo Dai, Chen Change Loy
[pdf]
[DOI]

Map-Free Visual Relocalization: Metric Pose Relative to a Single Image
Eduardo Arnold, Jamie Wynn, Sara Vicente, Guillermo Garcia-Hernando, Aron Monszpart, Victor Prisacariu, Daniyar Turmukhambetov, Eric Brachmann
[pdf]
[DOI]

Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation
Zhengming Zhou, Qiulei Dong
[pdf]
[DOI]

Planes vs. Chairs: Category-Guided 3D Shape Learning without Any 3D Cues
Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg
[pdf]
[DOI]

MHR-Net: Multiple-Hypothesis Reconstruction of Non-rigid Shapes from 2D Views
Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
[pdf]
[DOI]

Depth Map Decomposition for Monocular Depth Estimation
Jinyoung Jun, Jae-Han Lee, Chul Lee, Chang-Su Kim
[pdf]
[DOI]

Monitored Distillation for Positive Congruent Depth Completion
Tian Yu Liu, Parth Agrawal, Allison Chen, Byung-Woo Hong, Alex Wong
[pdf]
[DOI]

Resolution-Free Point Cloud Sampling Network with Data Distillation
Tianxin Huang, Jiangning Zhang, Jun Chen, Yuang Liu, Yong Liu
[pdf]
[DOI]

Organic Priors in Non-rigid Structure from Motion
Suryansh Kumar, Luc Van Gool
[pdf]
[DOI]

Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation
Yinlin Hu, Pascal Fua, Mathieu Salzmann
[pdf]
[DOI]

DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks
Shih-Yang Su, Timur Bagautdinov, Helge Rhodin
[pdf]
[DOI]

CHORE: Contact, Human and Object REconstruction from a Single RGB Image
Xianghui Xie, Bharat Lal Bhatnagar, Gerard Pons-Moll
[pdf]
[DOI]

Learned Vertex Descent: A New Direction for 3D Human Model Fitting
Enric Corona, Gerard Pons-Moll, Guillem Alenyà, Francesc Moreno-Noguer
[pdf]
[DOI]

Self-Calibrating Photometric Stereo by Neural Inverse Rendering
Junxuan Li, Hongdong Li
[pdf]
[DOI]

3D Clothed Human Reconstruction in the Wild
Gyeongsik Moon, Hyeongjin Nam, Takaaki Shiratori, Kyoung Mu Lee
[pdf]
[DOI]

Directed Ray Distance Functions for 3D Scene Reconstruction
Nilesh Kulkarni, Justin Johnson, David F. Fouhey
[pdf]
[DOI]

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image
Zhaoxin Fan, Zhenbo Song, Jian Xu, Zhicheng Wang, Kejian Wu, Hongyan Liu, Jun He
[pdf]
[DOI]

Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression
Dongting Hu, Liuhua Peng, Tingjin Chu, Xiaoxing Zhang, Yinian Mao, Howard Bondell, Mingming Gong
[pdf]
[DOI]

CostDCNet: Cost Volume Based Depth Completion for a Single RGB-D Image
Jaewon Kam, Jungeon Kim, Soongjin Kim, Jaesik Park, Seungyong Lee
[pdf]
[DOI]

ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization
Muhammad Zubair Irshad, Sergey Zakharov, Rareș Ambruș, Thomas Kollar, Zsolt Kira, Adrien Gaidon
[pdf]
[DOI]

3D Siamese Transformer Network for Single Object Tracking on Point Clouds
Le Hui, Lingpeng Wang, Linghua Tang, Kaihao Lan, Jin Xie, Jian Yang
[pdf]
[DOI]

Object Wake-Up: 3D Object Rigging from a Single Image
Ji Yang, Xinxin Zuo, Sen Wang, Zhenbo Yu, Xingyu Li, Bingbing Ni, Minglun Gong, Li Cheng
[pdf]
[DOI]

IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-View Human Reconstruction
Kennard Yanting Chan, Guosheng Lin, Haiyu Zhao, Weisi Lin
[pdf]
[DOI]

Realistic One-Shot Mesh-Based Head Avatars
Taras Khakhulin, Vanessa Sklyarova, Victor Lempitsky, Egor Zakharov
[pdf]
[DOI]

A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks
Martha Paskin, Daniel Baum, Mason N. Dean, Christoph von Tycowicz
[pdf]
[DOI]

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion
Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler
[pdf]
[DOI]

Perspective Phase Angle Model for Polarimetric 3D Reconstruction
Guangcheng Chen, Li He, Yisheng Guan, Hong Zhang
[pdf]
[DOI]

DeepShadow: Neural Shape from Shadow
Asaf Karnieli, Ohad Fried, Yacov Hel-Or
[pdf]
[DOI]

Camera Auto-Calibration from the Steiner Conic of the Fundamental Matrix
Yu Liu, Hui Zhang
[pdf]
[DOI]

Super-Resolution 3D Human Shape from a Single Low-Resolution Image
Marco Pesavento, Marco Volino, Adrian Hilton
[pdf]
[DOI]

Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion
Weng Fei Low, Gim Hee Lee
[pdf]
[DOI]

ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing
Daxuan Ren, Jianmin Zheng, Jianfei Cai, Jiatong Li, Junzhe Zhang
[pdf]
[DOI]

CATRE: Iterative Point Clouds Alignment for Category-Level Object Pose Refinement
Xingyu Liu, Gu Wang, Yi Li, Xiangyang Ji
[pdf]
[DOI]

Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation
Jingyu Gong, Fengqi Liu, Jiachen Xu, Min Wang, Xin Tan, Zhizhong Zhang, Ran Yi, Haichuan Song, Yuan Xie, Lizhuang Ma
[pdf]
[DOI]

Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction
Haocheng Yuan, Chen Zhao, Shichao Fan, Jiaxi Jiang, Jiaqi Yang
[pdf]
[DOI]

MvDeCor: Multi-View Dense Correspondence Learning for Fine-Grained 3D Segmentation
Gopal Sharma, Kangxue Yin, Subhransu Maji, Evangelos Kalogerakis, Or Litany, Sanja Fidler
[pdf]
[DOI]

SUPR: A Sparse Unified Part-Based Human Representation
Ahmed A. A. Osman, Timo Bolkart, Dimitrios Tzionas, Michael J. Black
[pdf]
[DOI]

Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach
Rolandos Alexandros Potamias, Giorgos Bouritsas, Stefanos Zafeiriou
[pdf]
[DOI]

Masked Autoencoders for Point Cloud Self-Supervised Learning
Yatian Pang, Wenxiao Wang, Francis E.H. Tay, Wei Liu, Yonghong Tian, Li Yuan
[pdf]
[DOI]

Intrinsic Neural Fields: Learning Functions on Manifolds
Lukas Koestler, Daniel Grittner, Michael Moeller, Daniel Cremers, Zorah Lähner
[pdf]
[DOI]

Skeleton-Free Pose Transfer for Stylized 3D Characters
Zhouyingcheng Liao, Jimei Yang, Jun Saito, Gerard Pons-Moll, Yang Zhou
[pdf]
[DOI]

Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu, Mu Cai, Yong Jae Lee
[pdf]
[DOI]

FBNet: Feedback Network for Point Cloud Completion
Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, Shiliang Pu, Li Lu
[pdf]
[DOI]

Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds
Ta-Ying Cheng, Qingyong Hu, Qian Xie, Niki Trigoni, Andrew Markham
[pdf]
[DOI]

A Level Set Theory for Neural Implicit Evolution under Explicit Flows
Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi
[pdf]
[DOI]

Efficient Point Cloud Analysis Using Hilbert Curve
Wanli Chen, Xinge Zhu, Guojin Chen, Bei Yu
[pdf]
[DOI]

TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement
Keyang Zhou, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll
[pdf]
[DOI]

LaTeRF: Label and Text Driven Object Radiance Fields
Ashkan Mirzaei, Yash Kant, Jonathan Kelly, Igor Gilitschenski
[pdf]
[DOI]

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis
Yaqian Liang, Shanshan Zhao, Baosheng Yu, Jing Zhang, Fazhi He
[pdf]
[DOI]

Unsupervised Deep Multi-Shape Matching
Dongliang Cao, Florian Bernard
[pdf]
[DOI]

Texturify: Generating Textures on 3D Shape Surfaces
Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai
[pdf]
[DOI]

Autoregressive 3D Shape Generation via Canonical Mapping
An-Chieh Cheng, Xueting Li, Sifei Liu, Min Sun, Ming-Hsuan Yang
[pdf]
[DOI]

PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees
Jun-Kun Chen, Yu-Xiong Wang
[pdf]
[DOI]

UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation
Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao
[pdf]
[DOI]

PRIF: Primary Ray-Based Implicit Function
Brandon Y. Feng, Yinda Zhang, Danhang Tang, Ruofei Du, Amitabh Varshney
[pdf]
[DOI]

Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction
Hanxue Liang, Hehe Fan, Zhiwen Fan, Yi Wang, Tianlong Chen, Yu Cheng, Zhangyang Wang
[pdf]
[DOI]

CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes
Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh
[pdf]
[DOI]

PlaneFormers: From Sparse View Planes to 3D Reconstruction
Samir Agarwala, Linyi Jin, Chris Rockwell, David F. Fouhey
[pdf]
[DOI]

Learning Implicit Templates for Point-Based Clothed Human Modeling
Siyou Lin, Hongwen Zhang, Zerong Zheng, Ruizhi Shao, Yebin Liu
[pdf]
[DOI]

Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks
Qianjiang Hu, Daizong Liu, Wei Hu
[pdf]
[DOI]

Structure-Aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation
Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu
[pdf]
[DOI]

MoFaNeRF: Morphable Facial Neural Radiance Field
Yiyu Zhuang, Hao Zhu, Xusen Sun, Xun Cao
[pdf]
[DOI]

PointInst3D: Segmenting 3D Instances by Points
Tong He, Wei Yin, Chunhua Shen, Anton van den Hengel
[pdf]
[DOI]

Cross-Modal 3D Shape Generation and Manipulation
Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov
[pdf]
[DOI]

Latent Partition Implicit with Surface Codes for 3D Representation
Chao Chen, Yu-Shen Liu, Zhizhong Han
[pdf]
[DOI]

Implicit Field Supervision for Robust Non-rigid Shape Matching
Ramana Sundararaman, Gautam Pai, Maks Ovsjanikov
[pdf]
[DOI]

Learning Self-Prior for Mesh Denoising Using Dual Graph Convolutional Networks
Shota Hattori, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki
[pdf]
[DOI]

diffConv: Analyzing Irregular Point Clouds with an Irregular View
Manxi Lin, Aasa Feragen
[pdf]
[DOI]

PD-Flow: A Point Cloud Denoising Framework with Normalizing Flows
Aihua Mao, Zihui Du, Yu-Hui Wen, Jun Xuan, Yong-Jin Liu
[pdf]
[DOI]

SeedFormer: Patch Seeds Based Point Cloud Completion with Upsample Transformer
Haoran Zhou, Yun Cao, Wenqing Chu, Junwei Zhu, Tong Lu, Ying Tai, Chengjie Wang
[pdf]
[DOI]

DeepMend: Learning Occupancy Functions to Represent Shape for Repair
Nikolas Lamb, Sean Banerjee, Natasha Kholgade Banerjee
[pdf]
[DOI]

A Repulsive Force Unit for Garment Collision Handling in Neural Networks
Qingyang Tan, Yi Zhou, Tuanfeng Wang, Duygu Ceylan, Xin Sun, Dinesh Manocha
[pdf]
[DOI]

Shape-Pose Disentanglement Using SE(3)-Equivariant Vector Neurons
Oren Katzir, Dani Lischinski, Daniel Cohen-Or
[pdf]
[DOI]

3D Equivariant Graph Implicit Functions
Yunlu Chen, Basura Fernando, Hakan Bilen, Matthias Nießner, Efstratios Gavves
[pdf]
[DOI]

PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation
Bo Sun, Vladimir G. Kim, Noam Aigerman, Qixing Huang, Siddhartha Chaudhuri
[pdf]
[DOI]

3D Shape Sequence of Human Comparison and Classification Using Current and Varifolds
Emery Pierson, Mohamed Daoudi, Sylvain Arguillere
[pdf]
[DOI]

Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification
Jianxiong Shen, Antonio Agudo, Francesc Moreno-Noguer, Adria Ruiz
[pdf]
[DOI]

Unsupervised Pose-Aware Part Decomposition for Man-Made Articulated Objects
Yuki Kawana, Yusuke Mukuta, Tatsuya Harada
[pdf]
[DOI]

MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks
Benoît Guillard, Federico Stella, Pascal Fua
[pdf]
[DOI]

SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement
Zhaofan Qiu, Yehao Li, Yu Wang, Yingwei Pan, Ting Yao, Tao Mei
[pdf]
[DOI]

The Shape Part Slot Machine: Contact-Based Reasoning for Generating 3D Shapes from Parts
Kai Wang, Paul Guerrero, Vladimir G. Kim, Siddhartha Chaudhuri, Minhyuk Sung, Daniel Ritchie
[pdf]
[DOI]

Spatiotemporal Self-Attention Modeling with Temporal Patch Shift for Action Recognition
Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang
[pdf]
[DOI]

Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Semi-Supervised Temporal Action Detection with Proposal-Free Masking
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video
Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof
[pdf]
[DOI]

S2N: Suppression-Strengthen Network for Event-Based Recognition under Variant Illuminations
Zengyu Wan, Yang Wang, Ganchao Tan, Yang Cao, Zheng-Jun Zha
[pdf]
[DOI]

CMD: Self-Supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation
Yunyao Mao, Wengang Zhou, Zhenbo Lu, Jiajun Deng, Houqiang Li
[pdf]
[DOI]

Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling
[pdf]
[DOI]

Hunting Group Clues with Transformers for Social Group Activity Recognition
Masato Tamura, Rahul Vishwakarma, Ravigopal Vennelakanti
[pdf]
[DOI]

Contrastive Positive Mining for Unsupervised 3D Action Representation Learning
Haoyuan Zhang, Yonghong Hou, Wenjing Zhang, Wanqing Li
[pdf]
[DOI]

Target-Absent Human Attention
Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
[pdf]
[DOI]

Uncertainty-Based Spatial-Temporal Attention for Online Action Detection
Hongji Guo, Zhou Ren, Yi Wu, Gang Hua, Qiang Ji
[pdf]
[DOI]

Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows
Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
[pdf]
[DOI]

Rethinking Zero-Shot Action Recognition: Learning from Latent Atomic Actions
Yijun Qian, Lijun Yu, Wenhe Liu, Alexander G. Hauptmann
[pdf]
[DOI]

Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection
Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu
[pdf]
[DOI]

Collaborating Domain-Shared and Target-Specific Feature Clustering for Cross-Domain 3D Action Recognition
Qinying Liu, Zilei Wang
[pdf]
[DOI]

Is Appearance Free Action Recognition Possible?
Filip Ilic, Thomas Pock, Richard P. Wildes
[pdf]
[DOI]

Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition
Ning Ma, Hongyi Zhang, Xuhui Li, Sheng Zhou, Zhen Zhang, Jun Wen, Haifeng Li, Jingjun Gu, Jiajun Bu
[pdf]
[DOI]

Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization
Mengyuan Chen, Junyu Gao, Shicai Yang, Changsheng Xu
[pdf]
[DOI]

Global-Local Motion Transformer for Unsupervised Skeleton-Based Action Learning
Boeun Kim, Hyung Jin Chang, Jungho Kim, Jin Young Choi
[pdf]
[DOI]

AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition
Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang
[pdf]
[DOI]

Panoramic Human Activity Recognition
Ruize Han, Haomin Yan, Jiacheng Li, Songmiao Wang, Wei Feng, Song Wang
[pdf]
[DOI]

Delving into Details: Synopsis-to-Detail Networks for Video Recognition
Shuxian Liang, Xu Shen, Jianqiang Huang, Xian-Sheng Hua
[pdf]
[DOI]

A Generalized & Robust Framework for Timestamp Supervision in Temporal Action Segmentation
Rahul Rahaman, Dipika Singhania, Alexandre Thiery, Angela Yao
[pdf]
[DOI]

Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning
Sipeng Zheng, Shizhe Chen, Qin Jin
[pdf]
[DOI]

PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens
Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
[pdf]
[DOI]

Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection
Guoqiu Li, Guanxiong Cai, Xingyu Zeng, Rui Zhao
[pdf]
[DOI]

Compound Prototype Matching for Few-Shot Action Recognition
Yifei Huang, Lijin Yang, Yoichi Sato
[pdf]
[DOI]

Continual 3D Convolutional Neural Networks for Real-Time Processing of Videos
Lukas Hedegaard, Alexandros Iosifidis
[pdf]
[DOI]

Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition
Tianjiao Li, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Anran Wang, Jinghua Wang, Jun Liu
[pdf]
[DOI]

Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection
Zhiwei Yang, Peng Wu, Jing Liu, Xiaotao Liu
[pdf]
[DOI]

Action Quality Assessment with Temporal Parsing Transformer
Yang Bai, Desen Zhou, Songyang Zhang, Jian Wang, Errui Ding, Yu Guan, Yang Long, Jingdong Wang
[pdf]
[DOI]

Entry-Flipped Transformer for Inference and Prediction of Participant Behavior
Bo Hu, Tat-Jen Cham
[pdf]
[DOI]

Pairwise Contrastive Learning Network for Action Quality Assessment
Mingzhe Li, Hong-Bo Zhang, Qing Lei, Zongwen Fan, Jinghua Liu, Ji-Xiang Du
[pdf]
[DOI]

Geometric Features Informed Multi-Person Human-Object Interaction Recognition in Videos
Tanqiu Qiao, Qianhui Men, Frederick W. B. Li, Yoshiki Kubotani, Shigeo Morishima, Hubert P. H. Shum
[pdf]
[DOI]

ActionFormer: Localizing Moments of Actions with Transformers
Chen-Lin Zhang, Jianxin Wu, Yin Li
[pdf]
[DOI]

SocialVAE: Human Trajectory Prediction Using Timewise Latents
Pei Xu, Jean-Bernard Hayet, Ioannis Karamouzas
[pdf]
[DOI]

Shape Matters: Deformable Patch Attack
Zhaoyu Chen, Bo Li, Shuang Wu, Jianghe Xu, Shouhong Ding, Wenqiang Zhang
[pdf]
[DOI]

Frequency Domain Model Augmentation for Adversarial Attack
Yuyang Long, Qilong Zhang, Boheng Zeng, Lianli Gao, Xianglong Liu, Jian Zhang, Jingkuan Song
[pdf]
[DOI]

Prior-Guided Adversarial Initialization for Fast Adversarial Training
Xiaojun Jia, Yong Zhang, Xingxing Wei, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao
[pdf]
[DOI]

Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation
Shiji Zhao, Jie Yu, Zhenlong Sun, Bo Zhang, Xingxing Wei
[pdf]
[DOI]

LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Martin Gubri, Maxime Cordy, Mike Papadakis, Yves Le Traon, Koushik Sen
[pdf]
[DOI]

A Large-Scale Multiple-Objective Method for Black-Box Attack against Object Detection
Siyuan Liang, Longkang Li, Yanbo Fan, Xiaojun Jia, Jingzhi Li, Baoyuan Wu, Xiaochun Cao
[pdf]
[DOI]

GradAuto: Energy-Oriented Attack on Dynamic Neural Networks
Jianhong Pan, Qichen Zheng, Zhipeng Fan, Hossein Rahmani, Qiuhong Ke, Jun Liu
[pdf]
[DOI]

A Spectral View of Randomized Smoothing under Common Corruptions: Benchmarking and Improving Certified Robustness
Jiachen Sun, Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Dan Hendrycks, Jihun Hamm, Z. Morley Mao
[pdf]
[DOI]

Improving Adversarial Robustness of 3D Point Cloud Classification Models
Guanlin Li, Guowen Xu, Han Qiu, Ruan He, Jiwei Li, Tianwei Zhang
[pdf]
[DOI]

Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number
Xian Wei, Yangyu Xu, Yanhui Huang, Hairong Lv, Hai Lan, Mingsong Chen, Xuan Tang
[pdf]
[DOI]

RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN
Huy Phan, Cong Shi, Yi Xie, Tianfang Zhang, Zhuohang Li, Tianming Zhao, Jian Liu, Yan Wang, Yingying Chen, Bo Yuan
[pdf]
[DOI]

Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks
Xiao Yang, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu
[pdf]
[DOI]

Adaptive Image Transformations for Transfer-Based Adversarial Attack
Zheng Yuan, Jie Zhang, Shiguang Shan
[pdf]
[DOI]

Generative Multiplane Images: Making a 2D GAN 3D-Aware
Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn
[pdf]
[DOI]

AdvDO: Realistic Adversarial Attacks for Trajectory Prediction
Yulong Cao, Chaowei Xiao, Anima Anandkumar, Danfei Xu, Marco Pavone
[pdf]
[DOI]

Adversarial Contrastive Learning via Asymmetric InfoNCE
Qiying Yu, Jieming Lou, Xianyuan Zhan, Qizhang Li, Wangmeng Zuo, Yang Liu, Jingjing Liu
[pdf]
[DOI]

One Size Does NOT Fit All: Data-Adaptive Adversarial Training
Shuo Yang, Chang Xu
[pdf]
[DOI]

UniCR: Universally Approximated Certified Robustness via Randomized Smoothing
Hanbin Hong, Binghui Wang, Yuan Hong
[pdf]
[DOI]

Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
Jiawang Bai, Kuofeng Gao, Dihong Gong, Shu-Tao Xia, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Robust Network Architecture Search via Feature Distortion Restraining
Yaguan Qian, Shenghui Huang, Bin Wang, Xiang Ling, Xiaohui Guan, Zhaoquan Gu, Shaoning Zeng, Wujie Zhou, Haijiang Wang
[pdf]
[DOI]

SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination
Zhuowen Yuan, Fan Wu, Yunhui Long, Chaowei Xiao, Bo Li
[pdf]
[DOI]

Triangle Attack: A Query-Efficient Decision-Based Adversarial Attack
Xiaosen Wang, Zeliang Zhang, Kangheng Tong, Dihong Gong, Kun He, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Data-Free Backdoor Removal Based on Channel Lipschitzness
Runkai Zheng, Rongjun Tang, Jianze Li, Li Liu
[pdf]
[DOI]

Black-Box Dissector: Towards Erasing-Based Hard-Label Model Stealing Attack
Yixu Wang, Jie Li, Hong Liu, Yan Wang, Yongjian Wu, Feiyue Huang, Rongrong Ji
[pdf]
[DOI]

Learning Energy-Based Models with Adversarial Training
Xuwang Yin, Shiying Li, Gustavo K. Rohde
[pdf]
[DOI]

Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation
Ganlin Liu, Xiaowei Huang, Xinping Yi
[pdf]
[DOI]

Revisiting Outer Optimization in Adversarial Training
Ali Dabouei, Fariborz Taherkhani, Sobhan Soleymani, Nasser M. Nasrabadi
[pdf]
[DOI]

Zero-Shot Attribute Attacks on Fine-Grained Recognition Models
Nasim Shafiee, Ehsan Elhamifar
[pdf]
[DOI]

Towards Effective and Robust Neural Trojan Defenses via Input Filtering
Kien Do, Haripriya Harikumar, Hung Le, Dung Nguyen, Truyen Tran, Santu Rana, Dang Nguyen, Willy Susilo, Svetha Venkatesh
[pdf]
[DOI]

Scaling Adversarial Training to Large Perturbation Bounds
Sravanti Addepalli, Samyak Jain, Gaurang Sriramanan, R. Venkatesh Babu
[pdf]
[DOI]

Exploiting the Local Parabolic Landscapes of Adversarial Losses to Accelerate Black-Box Adversarial Attack
Hoang Tran, Dan Lu, Guannan Zhang
[pdf]
[DOI]

Generative Domain Adaptation for Face Anti-Spoofing
Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li
[pdf]
[DOI]

GaitEdge: Beyond Plain End-to-End Gait Recognition for Better Practicality
Junhao Liang, Chao Fan, Saihui Hou, Chuanfu Shen, Yongzhen Huang, Shiqi Yu
[pdf]
[DOI]

UIA-ViT: Unsupervised Inconsistency-Aware Method Based on Vision Transformer for Face Forgery Detection
Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, Nenghai Yu
[pdf]
[DOI]

Effective Presentation Attack Detection Driven by Face Related Task
Wentian Zhang, Haozhe Liu, Feng Liu, Raghavendra Ramachandra, Christoph Busch
[pdf]
[DOI]

PPT: Token-Pruned Pose Transformer for Monocular and Multi-View Human Pose Estimation
Haoyu Ma, Zhe Wang, Yifei Chen, Deying Kong, Liangjian Chen, Xingwei Liu, Xiangyi Yan, Hao Tang, Xiaohui Xie
[pdf]
[DOI]

AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing
Jiaxi Jiang, Paul Streli, Huajian Qiu, Andreas Fender, Larissa Laich, Patrick Snape, Christian Holz
[pdf]
[DOI]

P-STMO: Pre-trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation
Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao
[pdf]
[DOI]

D&D: Learning Human Dynamics from Dynamic Camera
Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
[pdf]
[DOI]

Explicit Occlusion Reasoning for Multi-Person 3D Human Pose Estimation
Qihao Liu, Yi Zhang, Song Bai, Alan Yuille
[pdf]
[DOI]

COUCH: Towards Controllable Human-Chair Interactions
Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Vladimir Guzov, Gerard Pons-Moll
[pdf]
[DOI]

Identity-Aware Hand Mesh Estimation and Personalization from RGB Images
Deying Kong, Linguang Zhang, Liangjian Chen, Haoyu Ma, Xiangyi Yan, Shanlin Sun, Xingwei Liu, Kun Han, Xiaohui Xie
[pdf]
[DOI]

C3P: Cross-Domain Pose Prior Propagation for Weakly Supervised 3D Human Pose Estimation
Cunlin Wu, Yang Xiao, Boshen Zhang, Mingyang Zhang, Zhiguo Cao, Joey Tianyi Zhou
[pdf]
[DOI]

Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
Garvita Tiwari, Dimitrije Antić, Jan Eric Lenssen, Nikolaos Sarafianos, Tony Tung, Gerard Pons-Moll
[pdf]
[DOI]

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Zhihao Li, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan
[pdf]
[DOI]

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation
Ailing Zeng, Xuan Ju, Lei Yang, Ruiyuan Gao, Xizhou Zhu, Bo Dai, Qiang Xu
[pdf]
[DOI]

SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos
Ailing Zeng, Lei Yang, Xuan Ju, Jiefeng Li, Jianyi Wang, Qiang Xu
[pdf]
[DOI]

PoseTrans: A Simple yet Effective Pose Transformation Augmentation for Human Pose Estimation
Wentao Jiang, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Si Liu
[pdf]
[DOI]

Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement
Junuk Cha, Muhammad Saqlain, GeonU Kim, Mingyu Shin, Seungryul Baek
[pdf]
[DOI]

Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction
Xiaoning Sun, Qiongjie Cui, Huaijiang Sun, Bin Li, Weiqing Li, Jianfeng Lu
[pdf]
[DOI]

Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation
Zhuo Chen, Xu Zhao, Xiaoyue Wan
[pdf]
[DOI]

Audio-Driven Stylized Gesture Generation with Flow-Based Model
Sheng Ye, Yu-Hui Wen, Yanan Sun, Ying He, Ziyang Zhang, Yaoyuan Wang, Weihua He, Yong-Jin Liu
[pdf]
[DOI]

Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation
Zhehan Kan, Shuoshuo Chen, Zeng Li, Zhihai He
[pdf]
[DOI]

UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture
Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik
[pdf]
[DOI]

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction
Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang
[pdf]
[DOI]

Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation
William McNally, Kanav Vats, Alexander Wong, John McPhee
[pdf]
[DOI]

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data
Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang
[pdf]
[DOI]

Poseur: Direct Human Pose Regression with Transformers
Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, Anton van den Hengel
[pdf]
[DOI]

SimCC: A Simple Coordinate Classification Perspective for Human Pose Estimation
Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang, Shu-Tao Xia
[pdf]
[DOI]

Regularizing Vector Embedding in Bottom-Up Human Pose Estimation
Haixin Wang, Lu Zhou, Yingying Chen, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

A Visual Navigation Perspective for Category-Level Object Pose Estimation
Jiaxin Guo, Fangxun Zhong, Rong Xiong, Yun-Hui Liu, Yue Wang, Yiyi Liao
[pdf]
[DOI]

Faster VoxelPose: Real-Time 3D Human Pose Estimation by Orthographic Projection
Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang
[pdf]
[DOI]

Learning to Fit Morphable Models
Vasileios Choutas, Federica Bogo, Jingjing Shen, Julien Valentin
[pdf]
[DOI]

EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices
Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Taein Kwon, Marc Pollefeys, Federica Bogo, Siyu Tang
[pdf]
[DOI]

Grasp’D: Differentiable Contact-Rich Grasp Synthesis for Multi-Fingered Hands
Dylan Turpin, Liquan Wang, Eric Heiden, Yun-Chun Chen, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg
[pdf]
[DOI]

AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling
Ziqian Bai, Timur Bagautdinov, Javier Romero, Michael Zollhöfer, Ping Tan, Shunsuke Saito
[pdf]
[DOI]

Deep Radial Embedding for Visual Sequence Learning
Yuecong Min, Peiqi Jiao, Yanan Li, Xiaotao Wang, Lei Lei, Xiujuan Chai, Xilin Chen
[pdf]
[DOI]

SAGA: Stochastic Whole-Body Grasping with Contact
Yan Wu, Jiahao Wang, Yan Zhang, Siwei Zhang, Otmar Hilliges, Fisher Yu, Siyu Tang
[pdf]
[DOI]

Neural Capture of Animatable 3D Human from Monocular Video
Gusi Te, Xiu Li, Xiao Li, Jinglu Wang, Wei Hu, Yan Lu
[pdf]
[DOI]

General Object Pose Transformation Network from Unpaired Data
Yukun Su, Guosheng Lin, Ruizhou Sun, Qingyao Wu
[pdf]
[DOI]

Compositional Human-Scene Interaction Synthesis with Semantic Control
Kaifeng Zhao, Shaofei Wang, Yan Zhang, Thabo Beeler, Siyu Tang
[pdf]
[DOI]

PressureVision: Estimating Hand Pressure from a Single RGB Image
Patrick Grady, Chengcheng Tang, Samarth Brahmbhatt, Christopher D. Twigg, Chengde Wan, James Hays, Charles C. Kemp
[pdf]
[DOI]

PoseScript: 3D Human Poses from Natural Language
Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez
[pdf]
[DOI]

DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation
Jaewoo Park, Nam Ik Cho
[pdf]
[DOI]

3D Interacting Hand Pose Estimation by Hand De-Occlusion and Removal
Hao Meng, Sheng Jin, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo
[pdf]
[DOI]

Pose for Everything: Towards Category-Agnostic Pose Estimation
Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
[pdf]
[DOI]

PoseGPT: Quantization-Based 3D Human Motion Generation and Forecasting
Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez
[pdf]
[DOI]

DH-AUG: DH Forward Kinematics Model Driven Augmentation for 3D Human Pose Estimation
Linzhi Huang, Jiahao Liang, Weihong Deng
[pdf]
[DOI]

Estimating Spatially-Varying Lighting in Urban Scenes with Disentangled Representation
Jiajun Tang, Yongjie Zhu, Haoyu Wang, Jun Hoong Chan, Si Li, Boxin Shi
[pdf]
[DOI]

Boosting Event Stream Super-Resolution with a Recurrent Neural Network
Wenming Weng, Yueyi Zhang, Zhiwei Xiong
[pdf]
[DOI]

Projective Parallel Single-Pixel Imaging to Overcome Global Illumination in 3D Structure Light Scanning
Yuxi Li, Huijie Zhao, Hongzhi Jiang, Xudong Li
[pdf]
[DOI]

Semantic-Sparse Colorization Network for Deep Exemplar-Based Colorization
Yunpeng Bai, Chao Dong, Zenghao Chai, Andong Wang, Zhengzhuo Xu, Chun Yuan
[pdf]
[DOI]

Practical and Scalable Desktop-Based High-Quality Facial Capture
Alexandros Lattas, Yiming Lin, Jayanth Kannan, Ekin Ozturk, Luca Filipi, Giuseppe Claudio Guarnera, Gaurav Chawla, Abhijeet Ghosh
[pdf]
[DOI]

FAST-VQA: Efficient End-to-End Video Quality Assessment with Fragment Sampling
Haoning Wu, Chaofeng Chen, Jingwen Hou, Liang Liao, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin
[pdf]
[DOI]

Physically-Based Editing of Indoor Scene Lighting from a Single Image
Zhengqin Li, Jia Shi, Sai Bi, Rui Zhu, Kalyan Sunkavalli, Miloš Hašan, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker
[pdf]
[DOI]

LEDNet: Joint Low-Light Enhancement and Deblurring in the Dark
Shangchen Zhou, Chongyi Li, Chen Change Loy
[pdf]
[DOI]

MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects
Juewen Peng, Jianming Zhang, Xianrui Luo, Hao Lu, Ke Xian, Zhiguo Cao
[pdf]
[DOI]

Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset
Huanjing Yue, Zhiming Zhang, Jingyu Yang
[pdf]
[DOI]

Transform Your Smartphone into a DSLR Camera: Learning the ISP in the Wild
Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc Van Gool
[pdf]
[DOI]

Learning Deep Non-Blind Image Deconvolution without Ground Truths
Yuhui Quan, Zhuojie Chen, Huan Zheng, Hui Ji
[pdf]
[DOI]

NEST: Neural Event Stack for Event-Based Image Enhancement
Minggui Teng, Chu Zhou, Hanyue Lou, Boxin Shi
[pdf]
[DOI]

Editable Indoor Lighting Estimation
Henrique Weber, Mathieu Garon, Jean-François Lalonde
[pdf]
[DOI]

Fast Two-Step Blind Optical Aberration Correction
Thomas Eboli, Jean-Michel Morel, Gabriele Facciolo
[pdf]
[DOI]

Seeing Far in the Dark with Patterned Flash
Zhanghao Sun, Jian Wang, Yicheng Wu, Shree Nayar
[pdf]
[DOI]

PseudoClick: Interactive Image Segmentation with Click Imitation
Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, Ziyan Wu
[pdf]
[DOI]

CT2: Colorization Transformer via Color Tokens
Shuchen Weng, Jimeng Sun, Yu Li, Si Li, Boxin Shi
[pdf]
[DOI]

Simple Baselines for Image Restoration
Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

Spike Transformer: Monocular Depth Estimation for Spiking Camera
Jiyuan Zhang, Lulu Tang, Zhaofei Yu, Jiwen Lu, Tiejun Huang
[pdf]
[DOI]

Improving Image Restoration by Revisiting Global Information Aggregation
Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu
[pdf]
[DOI]

Data Association between Event Streams and Intensity Frames under Diverse Baselines
Dehao Zhang, Qiankun Ding, Peiqi Duan, Chu Zhou, Boxin Shi
[pdf]
[DOI]

D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration
Yuzhi Zhao, Yongzhe Xu, Qiong Yan, Dingdong Yang, Xuehui Wang, Lai-Man Po
[pdf]
[DOI]

Learning Graph Neural Networks for Image Style Transfer
Yongcheng Jing, Yining Mao, Yiding Yang, Yibing Zhan, Mingli Song, Xinchao Wang, Dacheng Tao
[pdf]
[DOI]

DeepPS2: Revisiting Photometric Stereo Using Two Differently Illuminated Images
Ashish Tiwari, Shanmuganathan Raman
[pdf]
[DOI]

Instance Contour Adjustment via Structure-Driven CNN
Shuchen Weng, Yi Wei, Ming-Ching Chang, Boxin Shi
[pdf]
[DOI]

Synthesizing Light Field Video from Monocular Video
Shrisudhan Govindarajan, Prasan Shedligeri, Sarah, Kaushik Mitra
[pdf]
[DOI]

Human-Centric Image Cropping with Partition-Aware and Content-Preserving Features
Bo Zhang, Li Niu, Xing Zhao, Liqing Zhang
[pdf]
[DOI]

DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting
Jihyong Oh, Munchurl Kim
[pdf]
[DOI]

Neural Image Representations for Multi-Image Fusion and Layer Separation
Seonghyeon Nam, Marcus A. Brubaker, Michael S. Brown
[pdf]
[DOI]

Bringing Rolling Shutter Images Alive with Dual Reversed Distortion
Zhihang Zhong, Mingdeng Cao, Xiao Sun, Zhirong Wu, Zhongyi Zhou, Yinqiang Zheng, Stephen Lin, Imari Sato
[pdf]
[DOI]

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
[pdf]
[DOI]

Video Interpolation by Event-Driven Anisotropic Adjustment of Optical Flow
Song Wu, Kaichao You, Weihua He, Chen Yang, Yang Tian, Yaoyuan Wang, Ziyang Zhang, Jianxing Liao
[pdf]
[DOI]

EvAC3D: From Event-Based Apparent Contours to 3D Models via Continuous Visual Hulls
Ziyun Wang, Kenneth Chaney, Kostas Daniilidis
[pdf]
[DOI]

DCCF: Deep Comprehensible Color Filter Learning Framework for High-Resolution Image Harmonization
Ben Xue, Shenghui Ran, Quan Chen, Rongfei Jia, Binqiang Zhao, Xing Tang
[pdf]
[DOI]

SelectionConv: Convolutional Neural Networks for Non-Rectilinear Image Data
David Hart, Michael Whitney, Bryan Morse
[pdf]
[DOI]

Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization
Jingtang Liang, Xiaodong Cun, Chi-Man Pun, Jue Wang
[pdf]
[DOI]

BigColor: Colorization Using a Generative Color Prior for Natural Images
Geonung Kim, Kyoungkook Kang, Seongtae Kim, Hwayoon Lee, Sehoon Kim, Jonghyun Kim, Seung-Hwan Baek, Sunghyun Cho
[pdf]
[DOI]

CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
Cheeun Hong, Sungyong Baik, Heewon Kim, Seungjun Nah, Kyoung Mu Lee
[pdf]
[DOI]

Deep Semantic Statistics Matching (D2SM) Denoising Network
Kangfu Mei, Vishal M. Patel, Rui Huang
[pdf]
[DOI]

3D Scene Inference from Transient Histograms
Sacha Jungerman, Atul Ingle, Yin Li, Mohit Gupta
[pdf]
[DOI]

Neural Space-Filling Curves
Hanyu Wang, Kamal Gupta, Larry Davis, Abhinav Shrivastava
[pdf]
[DOI]

Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging
An Gia Vien, Chul Lee
[pdf]
[DOI]

Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration
Weng-Tai Su, Yi-Chun Hung, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin
[pdf]
[DOI]

Tomography of Turbulence Strength Based on Scintillation Imaging
Nir Shaul, Yoav Y. Schechner
[pdf]
[DOI]

Realistic Blur Synthesis for Learning Image Deblurring
Jaesung Rim, Geonung Kim, Jungeon Kim, Junyong Lee, Seungyong Lee, Sunghyun Cho
[pdf]
[DOI]

Learning Phase Mask for Privacy-Preserving Passive Depth Estimation
Zaid Tasneem, Giovanni Milione, Yi-Hsuan Tsai, Xiang Yu, Ashok Veeraraghavan, Manmohan Chandraker, Francesco Pittaluga
[pdf]
[DOI]

LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval
Atreyee Saha, Salman S. Khan, Sagar Sehrawat, Sanjana S. Prabhu, Shanti Bhattacharya, Kaushik Mitra
[pdf]
[DOI]

PANDORA: Polarization-Aided Neural Decomposition of Radiance
Akshat Dave, Yongyi Zhao, Ashok Veeraraghavan
[pdf]
[DOI]

HuMMan: Multi-modal 4D Human Dataset for Versatile Sensing and Modeling
Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
[pdf]
[DOI]

DVS-Voltmeter: Stochastic Process-Based Event Simulator for Dynamic Vision Sensors
Songnan Lin, Ye Ma, Zhenhua Guo, Bihan Wen
[pdf]
[DOI]

Benchmarking Omni-Vision Representation through the Lens of Visual Realms
Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu
[pdf]
[DOI]

BEAT: A Large-Scale Semantic and Emotional Multi-modal Dataset for Conversational Gestures Synthesis
Haiyang Liu, Zihao Zhu, Naoya Iwamoto, Yichen Peng, Zhengqing Li, You Zhou, Elif Bozkurt, Bo Zheng
[pdf]
[DOI]

Neuromorphic Data Augmentation for Training Spiking Neural Networks
Yuhang Li, Youngeun Kim, Hyoungseob Park, Tamar Geller, Priyadarshini Panda
[pdf]
[DOI]

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy
[pdf]
[DOI]

MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
Alejandro Pardo, Fabian Caba, Juan León Alcázar, Ali Thabet, Bernard Ghanem
[pdf]
[DOI]

LaMAR: Benchmarking Localization and Mapping for Augmented Reality
Paul-Edouard Sarlin, Mihai Dusmanu, Johannes L. Schönberger, Pablo Speciale, Lukas Gruber, Viktor Larsson, Ondrej Miksik, Marc Pollefeys
[pdf]
[DOI]

Unitail: Detecting, Reading, and Matching in Retail Scene
Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides
[pdf]
[DOI]

Not Just Streaks: Towards Ground Truth for Single Image Deraining
Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso M. de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi
[pdf]
[DOI]

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-Verified Image-Caption Associations for MS-COCO
Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh
[pdf]
[DOI]

MOTCOM: The Multi-Object Tracking Dataset Complexity Metric
Malte Pedersen, Joakim Bruslund Haurum, Patrick Dendorfer, Thomas B. Moeslund
[pdf]
[DOI]

How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset?
Yuchi Liu, Zhongdao Wang, Tom Gedeon, Liang Zheng
[pdf]
[DOI]

A Real World Dataset for Multi-View 3D Reconstruction
Rakesh Shrestha, Siqi Hu, Minghao Gou, Ziyuan Liu, Ping Tan
[pdf]
[DOI]

REALY: Rethinking the Evaluation of 3D Face Reconstruction
Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao
[pdf]
[DOI]

Capturing, Reconstructing, and Simulating: The UrbanScene3D Dataset
Liqiang Lin, Yilin Liu, Yue Hu, Xingguang Yan, Ke Xie, Hui Huang
[pdf]
[DOI]

3D CoMPaT: Composition of Materials on Parts of 3D Things
Yuchen Li, Ujjwal Upadhyay, Habib Slim, Tezuesh Varshney, Ahmed Abdelreheem, Arpit Prajapati, Suhail Pothigara, Peter Wonka, Mohamed Elhoseiny
[pdf]
[DOI]

PartImageNet: A Large, High-Quality Dataset of Parts
Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Qihang Yu, Alan Yuille
[pdf]
[DOI]

A-OKVQA: A Benchmark for Visual Question Answering Using World Knowledge
Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Kenneth Marino, Roozbeh Mottaghi
[pdf]
[DOI]

OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images
Bingchen Zhao, Shaozuo Yu, Wufei Ma, Mingxin Yu, Shenxiao Mei, Angtian Wang, Ju He, Alan Yuille, Adam Kortylewski
[pdf]
[DOI]

Facial Depth and Normal Estimation Using Single Dual-Pixel Camera
Minjun Kang, Jaesung Choe, Hyowon Ha, Hae-Gon Jeon, Sunghoon Im, In So Kweon, Kuk-Jin Yoon
[pdf]
[DOI]

The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing
Dawit Mureja Argaw, Fabian Caba, Joon-Young Lee, Markus Woodson, In So Kweon
[pdf]
[DOI]

StyleBabel: Artistic Style Tagging and Captioning
Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse
[pdf]
[DOI]

PANDORA: A Panoramic Detection Dataset for Object with Orientation
Hang Xu, Qiang Zhao, Yike Ma, Xiaodong Li, Peng Yuan, Bailan Feng, Chenggang Yan, Feng Dai
[pdf]
[DOI]

FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context
Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song
[pdf]
[DOI]

Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
Grant Van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, Serge Belongie
[pdf]
[DOI]

The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting
Justin Kay, Peter Kulits, Suzanne Stathatos, Siqi Deng, Erik Young, Sara Beery, Grant Van Horn, Pietro Perona
[pdf]
[DOI]

A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns, Deniz Arsan, Sanjna Agrawal, Ranjitha Kumar, Kate Saenko, Bryan A. Plummer
[pdf]
[DOI]

BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis
Davide Moltisanti, Jinyi Wu, Bo Dai, Chen Change Loy
[pdf]
[DOI]

Dress Code: High-Resolution Multi-Category Virtual Try-On
Davide Morelli, Matteo Fincato, Marcella Cornia, Federico Landi, Fabio Cesari, Rita Cucchiara
[pdf]
[DOI]

A Data-Centric Approach for Improving Ambiguous Labels with Combined Semi-Supervised Classification and Clustering
Lars Schmarje, Monty Santarossa, Simon-Martin Schröder, Claudius Zelenka, Rainer Kiko, Jenny Stracke, Nina Volkmann, Reinhard Koch
[pdf]
[DOI]

ClearPose: Large-Scale Transparent Object Dataset and Benchmark
Xiaotong Chen, Huijie Zhang, Zeren Yu, Anthony Opipari, Odest Chadwicke Jenkins
[pdf]
[DOI]

When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics
Iuliia Pliushch, Martin Mundt, Nicolas Lupp, Visvanathan Ramesh
[pdf]
[DOI]

AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment
Kangyeol Kim, Sunghyun Park, Jaeseong Lee, Sunghyo Chung, Junsoo Lee, Jaegul Choo
[pdf]
[DOI]

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration
Thomas Hayes, Songyang Zhang, Xi Yin, Guan Pang, Sasha Sheng, Harry Yang, Songwei Ge, Qiyuan Hu, Devi Parikh
[pdf]
[DOI]

A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing
Paul Upchurch, Ransen Niu
[pdf]
[DOI]

MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis
Athanasios Papaioannou, Baris Gecer, Shiyang Cheng, Grigorios G. Chrysos, Jiankang Deng, Eftychia Fotiadou, Christos Kampouris, Dimitrios Kollias, Stylianos Moschoglou, Kritaphat Songsri-In, Stylianos Ploumpis, George Trigeorgis, Panagiotis Tzirakis, Evangelos Ververas, Yuxiang Zhou, Allan Ponniah, Anastasios Roussos, Stefanos Zafeiriou
[pdf]
[DOI]

Delving into Universal Lesion Segmentation: Method, Dataset, and Benchmark
Yu Qiu, Jing Xu
[pdf]
[DOI]

Large Scale Real-World Multi-person Tracking
Bing Shuai, Alessandro Bergamo, Uta Büchler, Andrew Berneshawi, Alyssa Boden, Joseph Tighe
[pdf]
[DOI]

D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights
Yuzhen Zhang, Wentong Wang, Weizhi Guo, Pei Lv, Mingliang Xu, Wei Chen, Dinesh Manocha
[pdf]
[DOI]

The Missing Link: Finding Label Relations across Datasets
Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
[pdf]
[DOI]

Learning Omnidirectional Flow in 360° Video via Siamese Representation
Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan
[pdf]
[DOI]

VizWiz-FewShot: Locating Objects in Images Taken by People with Visual Impairments
Yu-Yun Tseng, Alexander Bell, Danna Gurari
[pdf]
[DOI]

TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments
Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C.V. Jawahar
[pdf]
[DOI]

Trapped in Texture Bias? A Large Scale Comparison of Deep Instance Segmentation
Johannes Theodoridis, Jessica Hofmann, Johannes Maucher, Andreas Schilling
[pdf]
[DOI]

Deformable Feature Aggregation for Dynamic Multi-modal 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
[pdf]
[DOI]

WeLSA: Learning to Predict 6D Pose from Weakly Labeled Data Using Shape Alignment
Shishir Reddy Vutukur, Ivan Shugurov, Benjamin Busam, Andreas Hutter, Slobodan Ilic
[pdf]
[DOI]

Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph
Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai
[pdf]
[DOI]

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li
[pdf]
[DOI]

Long-Tail Detection with Effective Class-Margins
Jang Hyun Cho, Philipp Krähenbühl
[pdf]
[DOI]

Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency
Qing Lian, Yanbo Xu, Weilong Yao, Yingcong Chen, Tong Zhang
[pdf]
[DOI]

PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer towards Video Object Detection
Han Wang, Jun Tang, Xiaodong Liu, Shanyan Guan, Rong Xie, Li Song
[pdf]
[DOI]

BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Yu Qiao, Jifeng Dai
[pdf]
[DOI]

Category-Level 6D Object Pose and Size Estimation Using Self-Supervised Deep Prior Deformation Networks
Jiehong Lin, Zewei Wei, Changxing Ding, Kui Jia
[pdf]
[DOI]

Dense Teacher: Dense Pseudo-Labels for Semi-Supervised Object Detection
Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun
[pdf]
[DOI]

Point-to-Box Network for Accurate Object Detection via Single Point Supervision
Pengfei Chen, Xuehui Yu, Xumeng Han, Najmul Hassan, Kai Wang, Jiachen Li, Jian Zhao, Humphrey Shi, Zhenjun Han, Qixiang Ye
[pdf]
[DOI]

Domain Adaptive Hand Keypoint and Pixel Localization in the Wild
Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato
[pdf]
[DOI]

Towards Data-Efficient Detection Transformers
Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, Dacheng Tao
[pdf]
[DOI]

Open-Vocabulary DETR with Conditional Matching
Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
[pdf]
[DOI]

Prediction-Guided Distillation for Dense Object Detection
Chenhongyi Yang, Mateusz Ochal, Amos Storkey, Elliot J. Crowley
[pdf]
[DOI]

Multimodal Object Detection via Probabilistic Ensembling
Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, Shu Kong
[pdf]
[DOI]

Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris N. Metaxas
[pdf]
[DOI]

CPO: Change Robust Panorama to Point Cloud Localization
Junho Kim, Hojun Jang, Changwoon Choi, Young Min Kim
[pdf]
[DOI]

INT: Towards Infinite-Frames 3D Detection with an Efficient Framework
Jianyun Xu, Zhenwei Miao, Da Zhang, Hongyu Pan, Kaixuan Liu, Peihan Hao, Jun Zhu, Zhengyang Sun, Hongmin Li, Xin Zhan
[pdf]
[DOI]

End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution
Mingxiang Liao, Fang Wan, Yuan Yao, Zhenjun Han, Jialing Zou, Yuze Wang, Bailan Feng, Peng Yuan, Qixiang Ye
[pdf]
[DOI]

Calibration-Free Multi-View Crowd Counting
Qi Zhang, Antoni B. Chan
[pdf]
[DOI]

Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training
Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang
[pdf]
[DOI]

SuperLine3D: Self-Supervised Line Segmentation and Description for LiDAR Point Cloud
Xiangrui Zhao, Sheng Yang, Tianxin Huang, Jun Chen, Teng Ma, Mingyang Li, Yong Liu
[pdf]
[DOI]

Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He
[pdf]
[DOI]

Adversarially-Aware Robust Object Detector
Ziyi Dong, Pengxu Wei, Liang Lin
[pdf]
[DOI]

HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors
Luting Wang, Xiaojie Li, Yue Liao, Zeren Jiang, Jianlong Wu, Fei Wang, Chen Qian, Si Liu
[pdf]
[DOI]

You Should Look at All Objects
Zhenchao Jin, Dongdong Yu, Luchuan Song, Zehuan Yuan, Lequan Yu
[pdf]
[DOI]

Detecting Twenty-Thousand Classes Using Image-Level Supervision
Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra
[pdf]
[DOI]

DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation
Hongyang Li, Jiehong Lin, Kui Jia
[pdf]
[DOI]

Monocular 3D Object Detection with Depth from Motion
Tai Wang, Jiangmiao Pang, Dahua Lin
[pdf]
[DOI]

DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
Yilin Wen, Xiangyu Li, Hao Pan, Lei Yang, Zheng Wang, Taku Komura, Wenping Wang
[pdf]
[DOI]

Distilling Object Detectors with Global Knowledge
Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu, Yi Niu, Fan He
[pdf]
[DOI]

Unifying Visual Perception by Dispersible Points Learning
Jianming Liang, Guanglu Song, Biao Leng, Yu Liu
[pdf]
[DOI]

PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
[pdf]
[DOI]

Exploring Resolution and Degradation Clues As Self-Supervised Signal for Low Quality Object Detection
Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Renrui Zhang, Zenghui Zhang, Tatsuya Harada
[pdf]
[DOI]

Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features
Wufei Ma, Angtian Wang, Alan Yuille, Adam Kortylewski
[pdf]
[DOI]

Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection
Maoxun Yuan, Yinyan Wang, Xingxing Wei
[pdf]
[DOI]

RFLA: Gaussian Receptive Field Based Label Assignment for Tiny Object Detection
Chang Xu, Jinwang Wang, Wen Yang, Huai Yu, Lei Yu, Gui-Song Xia
[pdf]
[DOI]

Rethinking IoU-Based Optimization for Single-Stage 3D Object Detection
Hualian Sheng, Sijia Cai, Na Zhao, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao, Gim Hee Lee
[pdf]
[DOI]

TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction
Yang He, Ravi Garg, Amber Roy Chowdhury
[pdf]
[DOI]

Multi-faceted Distillation of Base-Novel Commonality for Few-Shot Object Detection
Shuang Wu, Wenjie Pei, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu
[pdf]
[DOI]

PointCLM: A Contrastive Learning-Based Framework for Multi-Instance Point Cloud Registration
Mingzhi Yuan, Zhihao Li, Qiuye Jin, Xinrong Chen, Manning Wang
[pdf]
[DOI]

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
Haotian Bai, Ruimao Zhang, Jiong Wang, Xiang Wan
[pdf]
[DOI]

MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer
Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, Jianxin Li, Kurt Keutzer, Shanghang Zhang
[pdf]
[DOI]

Multi-Domain Multi-Definition Landmark Localization for Small Datasets
David Ferman, Gaurav Bharaj
[pdf]
[DOI]

DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection
Abhinav Kumar, Garrick Brazil, Enrique Corona, Armin Parchami, Xiaoming Liu
[pdf]
[DOI]

Label-Guided Auxiliary Training Improves 3D Object Detector
Yaomin Huang, Xinmei Liu, Yichen Zhu, Zhiyuan Xu, Chaomin Shen, Zhengping Che, Guixu Zhang, Yaxin Peng, Feifei Feng, Jian Tang
[pdf]
[DOI]

PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images
Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
[pdf]
[DOI]

Densely Constrained Depth Estimator for Monocular 3D Object Detection
Yingyan Li, Yuntao Chen, Jiawei He, Zhaoxiang Zhang
[pdf]
[DOI]

Polarimetric Pose Prediction
Daoyi Gao, Yitong Li, Patrick Ruhkamp, Iuliia Skobleva, Magdalena Wysocki, HyunJun Jung, Pengyuan Wang, Arturo Guridi, Benjamin Busam
[pdf]
[DOI]

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching
Shuai Chen, Xinghui Li, Zirui Wang, Victor Adrian Prisacariu
[pdf]
[DOI]

Cornerformer: Purifying Instances for Corner-Based Detectors
Haoran Wei, Xin Chen, Lingxi Xie, Qi Tian
[pdf]
[DOI]

PillarNet: Real-Time and High-Performance Pillar-Based 3D Object Detection
Guangsheng Shi, Ruifeng Li, Chao Ma
[pdf]
[DOI]

Robust Object Detection with Inaccurate Bounding Boxes
Chengxin Liu, Kewei Wang, Hao Lu, Zhiguo Cao, Ziming Zhang
[pdf]
[DOI]

Efficient Decoder-Free Object Detection with Transformers
Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen
[pdf]
[DOI]

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection
Yu Hong, Hang Dai, Yong Ding
[pdf]
[DOI]

ReAct: Temporal Action Detection with Relational Queries
Dingfeng Shi, Yujie Zhong, Qiong Cao, Jing Zhang, Lin Ma, Jia Li, Dacheng Tao
[pdf]
[DOI]

Towards Accurate Active Camera Localization
Qihang Fang, Yingda Yin, Qingnan Fan, Fei Xia, Siyan Dong, Sheng Wang, Jue Wang, Leonidas J. Guibas, Baoquan Chen
[pdf]
[DOI]

Camera Pose Auto-Encoders for Improving Pose Regression
Yoli Shavit, Yosi Keller
[pdf]
[DOI]

Improving the Intra-Class Long-Tail in 3D Detection via Rare Example Mining
Chiyu Max Jiang, Mahyar Najibi, Charles R. Qi, Yin Zhou, Dragomir Anguelov
[pdf]
[DOI]

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization
Lei Zhu, Qian Chen, Lujia Jin, Yunfei You, Yanye Lu
[pdf]
[DOI]

UC-OWOD: Unknown-Classified Open World Object Detection
Zhiheng Wu, Yue Lu, Xingyu Chen, Zhengxing Wu, Liwen Kang, Junzhi Yu
[pdf]
[DOI]

RayTran: 3D Pose Estimation and Shape Reconstruction of Multiple Objects from Videos with Ray-Traced Transformers
Michał J. Tyszkiewicz, Kevis-Kokitsi Maninis, Stefan Popov, Vittorio Ferrari
[pdf]
[DOI]

GTCaR: Graph Transformer for Camera Re-Localization
Xinyi Li, Haibin Ling
[pdf]
[DOI]

3D Object Detection with a Self-Supervised Lidar Scene Flow Backbone
Emeç Erçelik, Ekim Yurtsever, Mingyu Liu, Zhijie Yang, Hanzhen Zhang, Pınar Topçam, Maximilian Listl, Yılmaz Kaan Çaylı, Alois Knoll
[pdf]
[DOI]

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
[pdf]
[DOI]

Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations
Wenjie Pei, Shuang Wu, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu
[pdf]
[DOI]

SALISA: Saliency-Based Input Sampling for Efficient Video Object Detection
Babak Ehteshami Bejnordi, Amirhossein Habibian, Fatih Porikli, Amir Ghodrati
[pdf]
[DOI]

ECO-TR: Efficient Correspondences Finding via Coarse-to-Fine Refinement
Dongli Tan, Jiang-Jiang Liu, Xingyu Chen, Chao Chen, Ruixin Zhang, Yunhang Shen, Shouhong Ding, Rongrong Ji
[pdf]
[DOI]

Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting
Yangzheng Wu, Mohsen Zand, Ali Etemad, Michael Greenspan
[pdf]
[DOI]

Long-Tailed Instance Segmentation Using Gumbel Optimized Loss
Konstantinos Panagiotis Alexandridis, Jiankang Deng, Anh Nguyen, Shan Luo
[pdf]
[DOI]

DetMatch: Two Teachers Are Better than One for Joint 2D and 3D Semi-Supervised Object Detection
Jinhyung Park, Chenfeng Xu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan
[pdf]
[DOI]

ObjectBox: From Centers to Boxes for Anchor-Free Object Detection
Mohsen Zand, Ali Etemad, Michael Greenspan
[pdf]
[DOI]

Is Geometry Enough for Matching in Visual Localization?
Qunjie Zhou, Sérgio Agostinho, Aljoša Ošep, Laura Leal-Taixé
[pdf]
[DOI]

SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds
Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, Dragomir Anguelov
[pdf]
[DOI]

PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry
Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou
[pdf]
[DOI]

GLAMD: Global and Local Attention Mask Distillation for Object Detectors
Younho Jang, Wheemyung Shin, Jinbeom Kim, Simon Woo, Sung-Ho Bae
[pdf]
[DOI]

FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
[pdf]
[DOI]

Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles
Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang
[pdf]
[DOI]

Class-Agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang
[pdf]
[DOI]

Enhancing Multi-modal Features Using Local Self-Attention for 3D Object Detection
Hao Li, Zehan Zhang, Xian Zhao, Yulong Wang, Yuxi Shen, Shiliang Pu, Hui Mao
[pdf]
[DOI]

Object Detection As Probabilistic Set Prediction
Georg Hess, Christoffer Petersson, Lennart Svensson
[pdf]
[DOI]

Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions
Zhi Li, Lu He, Huijuan Xu
[pdf]
[DOI]

Neural Correspondence Field for Object Pose Estimation
Lin Huang, Tomas Hodan, Lingni Ma, Linguang Zhang, Luan Tran, Christopher D. Twigg, Po-Chen Wu, Junsong Yuan, Cem Keskin, Robert Wang
[pdf]
[DOI]

On Label Granularity and Object Localization
Elijah Cole, Kimberly Wilber, Grant Van Horn, Xuan Yang, Marco Fornoni, Pietro Perona, Serge Belongie, Andrew Howard, Oisin Mac Aodha
[pdf]
[DOI]

OIMNet++: Prototypical Normalization and Localization-Aware Learning for Person Search
Sanghoon Lee, Youngmin Oh, Donghyeon Baek, Junghyup Lee, Bumsub Ham
[pdf]
[DOI]

Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure
Ruoqi Li, Chongyang Zhang, Hao Zhou, Chao Shi, Yan Luo
[pdf]
[DOI]

Learning with Free Object Segments for Long-Tailed Instance Segmentation
Cheng Zhang, Tai-Yu Pan, Tianle Chen, Jike Zhong, Wenjin Fu, Wei-Lun Chao
[pdf]
[DOI]

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu, Nikhil Mishra, Maximilian Sieb, Yide Shentu, Pieter Abbeel, Xi Chen
[pdf]
[DOI]

3D Random Occlusion and Multi-layer Projection for Deep Multi-Camera Pedestrian Localization
Rui Qiu, Ming Xu, Yuyao Yan, Jeremy S. Smith, Xi Yang
[pdf]
[DOI]

A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation
Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou
[pdf]
[DOI]

Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby
[pdf]
[DOI]

A Simple Approach and Benchmark for 21,000-Category Object Detection
Yutong Lin, Chen Li, Yue Cao, Zheng Zhang, Jianfeng Wang, Lijuan Wang, Zicheng Liu, Han Hu
[pdf]
[DOI]

Knowledge Condensation Distillation
Chenxin Li, Mingbao Lin, Zhiyuan Ding, Nie Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Liujuan Cao
[pdf]
[DOI]

Reducing Information Loss for Spiking Neural Networks
Yufei Guo, Yuanpei Chen, Liwen Zhang, YingLei Wang, Xiaode Liu, Xinyi Tong, Yuanyuan Ou, Xuhui Huang, Zhe Ma
[pdf]
[DOI]

Masked Generative Distillation
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan
[pdf]
[DOI]

Fine-Grained Data Distribution Alignment for Post-Training Quantization
Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
[pdf]
[DOI]

Learning with Recoverable Forgetting
Jingwen Ye, Yifang Fu, Jie Song, Xingyi Yang, Songhua Liu, Xin Jin, Mingli Song, Xinchao Wang
[pdf]
[DOI]

Efficient One Pass Self-Distillation with Zipf’s Label Smoothing
Jiajun Liang, Linze Li, Zhaodong Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
[pdf]
[DOI]

Prune Your Model before Distill It
Jinhyuk Park, Albert No
[pdf]
[DOI]

Deep Partial Updating: Towards Communication Efficient Updating for On-Device Inference
Zhongnan Qu, Cong Liu, Lothar Thiele
[pdf]
[DOI]

Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu
[pdf]
[DOI]

L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training
Jonghyun Bae, Woohyeon Baek, Tae Jun Ham, Jae W. Lee
[pdf]
[DOI]

Streaming Multiscale Deep Equilibrium Models
Can Ufuk Ertenli, Emre Akbas, Ramazan Gokberk Cinbis
[pdf]
[DOI]

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park
[pdf]
[DOI]

SP-Net: Slowly Progressing Dynamic Inference Networks
Huanyu Wang, Wenhu Zhang, Shihao Su, Hui Wang, Zhenwei Miao, Xin Zhan, Xi Li
[pdf]
[DOI]

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data
Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang
[pdf]
[DOI]

Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance
Chen Tang, Kai Ouyang, Zhi Wang, Yifei Zhu, Wen Ji, Yaowei Wang, Wenwu Zhu
[pdf]
[DOI]

Event Neural Networks
Matthew Dutson, Yin Li, Mohit Gupta
[pdf]
[DOI]

EdgeViTs: Competing Light-Weight CNNs on Mobile Devices with Vision Transformers
Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez
[pdf]
[DOI]

PalQuant: Accelerating High-Precision Networks on Low-Precision Accelerators
Qinghao Hu, Gang Li, Qiman Wu, Jian Cheng
[pdf]
[DOI]

Disentangled Differentiable Network Pruning
Shangqian Gao, Feihu Huang, Yanfu Zhang, Heng Huang
[pdf]
[DOI]

IDa-Det: An Information Discrepancy-Aware Distillation for 1-Bit Detectors
Sheng Xu, Yanjing Li, Bohan Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Peng Gao, Jinhu Lü
[pdf]
[DOI]

Learning to Weight Samples for Dynamic Early-Exiting Networks
Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfeng Cao, Wenhui Huang, Chao Deng, Gao Huang
[pdf]
[DOI]

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets
Zhijun Tu, Xinghao Chen, Pengju Ren, Yunhe Wang
[pdf]
[DOI]

Adaptive Token Sampling for Efficient Vision Transformers
Mohsen Fayyaz, Soroush Abbasi Koohpayegani, Farnoush Rezaei Jafari, Sunando Sengupta, Hamid Reza Vaezi Joze, Eric Sommerlade, Hamed Pirsiavash, Jürgen Gall
[pdf]
[DOI]

Weight Fixing Networks
Christopher Subia-Waud, Srinandan Dasmahapatra
[pdf]
[DOI]

Self-Slimmed Vision Transformer
Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
[pdf]
[DOI]

Switchable Online Knowledge Distillation
Biao Qian, Yang Wang, Hongzhi Yin, Richang Hong, Meng Wang
[pdf]
[DOI]

ℓ∞-Robustness and Beyond: Unleashing Efficient Adversarial Training
Hadi M. Dolatabadi, Sarah Erfani, Christopher Leckie
[pdf]
[DOI]

Multi-Granularity Pruning for Model Acceleration on Mobile Devices
Tianli Zhao, Xi Sheryl Zhang, Wentao Zhu, Jiaxing Wang, Sen Yang, Ji Liu, Jian Cheng
[pdf]
[DOI]

Deep Ensemble Learning by Diverse Knowledge Distillation for Fine-Grained Object Classification
Naoki Okamoto, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi
[pdf]
[DOI]

Helpful or Harmful: Inter-Task Association in Continual Learning
Hyundong Jin, Eunwoo Kim
[pdf]
[DOI]

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies
Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jing Shao, Chunlei Liu, Xianglong Liu
[pdf]
[DOI]

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton, Mohammad Rastegari
[pdf]
[DOI]

Ensemble Knowledge Guided Sub-network Search and Fine-Tuning for Filter Pruning
Seunghyun Lee, Byung Cheol Song
[pdf]
[DOI]

Network Binarization via Contrastive Learning
Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
[pdf]
[DOI]

Lipschitz Continuity Retained Binary Neural Network
Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
[pdf]
[DOI]

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning
Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Wei Niu, Mengshu Sun, Xuan Shen, Geng Yuan, Bin Ren, Hao Tang, Minghai Qin, Yanzhi Wang
[pdf]
[DOI]

Soft Masking for Cost-Constrained Channel Pruning
Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve, Jose Alvarez
[pdf]
[DOI]

Non-uniform Step Size Quantization for Accurate Post-Training Quantization
Sangyun Oh, Hyeonuk Sim, Jounghyun Kim, Jongeun Lee
[pdf]
[DOI]

SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
Haoran You, Baopu Li, Zhanyi Sun, Xu Ouyang, Yingyan Lin
[pdf]
[DOI]

Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously
Yi Sun, Jian Li, Xin Xu
[pdf]
[DOI]

Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning
Sayeed Shafayet Chowdhury, Nitin Rathi, Kaushik Roy
[pdf]
[DOI]

Towards Accurate Network Quantization with Equivalent Smooth Regularizer
Kirill Solodskikh, Vladimir Chikin, Ruslan Aydarkhanov, Dehua Song, Irina Zhelavskaya, Jiansheng Wei
[pdf]
[DOI]

Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization
Vladimir Chikin, Kirill Solodskikh, Irina Zhelavskaya
[pdf]
[DOI]

BASQ: Branch-Wise Activation-Clipping Search Quantization for Sub-4-Bit Neural Networks
Han-Byul Kim, Eunhyeok Park, Sungjoo Yoo
[pdf]
[DOI]

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding
Geng Yuan, Sung-En Chang, Qing Jin, Alec Lu, Yanyu Li, Yushu Wu, Zhenglun Kong, Yanyue Xie, Peiyan Dong, Minghai Qin, Xiaolong Ma, Xulong Tang, Zhenman Fang, Yanzhi Wang
[pdf]
[DOI]

Real Spike: Learning Real-Valued Spikes for Spiking Neural Networks
Yufei Guo, Liwen Zhang, Yuanpei Chen, Xinyi Tong, Xiaode Liu, YingLei Wang, Xuhui Huang, Zhe Ma
[pdf]
[DOI]

FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks
Vaikkunth Mugunthan, Eric Lin, Vignesh Gokul, Christian Lau, Lalana Kagal, Steve Pieper
[pdf]
[DOI]

Theoretical Understanding of the Information Flow on Continual Learning Performance
Joshua Andle, Salimeh Yasaei Sekeh
[pdf]
[DOI]

Exploring Lottery Ticket Hypothesis in Spiking Neural Networks
Youngeun Kim, Yuhang Li, Hyoungseob Park, Yeshwanth Venkatesha, Ruokai Yin, Priyadarshini Panda
[pdf]
[DOI]

On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network
Juseung Yun, Janghyeon Lee, Hyounguk Shon, Eojindl Yi, Seung Hwan Kim, Junmo Kim
[pdf]
[DOI]

LANA: Latency Aware Network Acceleration
Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat
[pdf]
[DOI]

RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization
Zhe Wang, Jie Lin, Xue Geng, Mohamed M. Sabry Aly, Vijay Chandrasekhar
[pdf]
[DOI]

U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
Ahmet Caner Yüzügüler, Nikolaos Dimitriadis, Pascal Frossard
[pdf]
[DOI]

PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization
Zhihang Yuan, Chenhao Xue, Yiqi Chen, Qiang Wu, Guangyu Sun
[pdf]
[DOI]

Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
Jiseok Youn, Jaehun Song, Hyung-Sin Kim, Saewoong Bahk
[pdf]
[DOI]

Understanding the Dynamics of DNNs Using Graph Modularity
Yao Lu, Wen Yang, Yunzhe Zhang, Zuohui Chen, Jinyin Chen, Qi Xuan, Zhen Wang, Xiaoniu Yang
[pdf]
[DOI]

Latent Discriminant Deterministic Uncertainty
Gianni Franchi, Xuanlong Yu, Andrei Bursuc, Emanuel Aldea, Severine Dubuisson, David Filliat
[pdf]
[DOI]

Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals
Simon Vandenhende, Dhruv Mahajan, Filip Radenovic, Deepti Ghadiyaram
[pdf]
[DOI]

HIVE: Evaluating the Human Interpretability of Visual Explanations
Sunnie S. Y. Kim, Nicole Meister, Vikram V. Ramaswamy, Ruth Fong, Olga Russakovsky
[pdf]
[DOI]

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks
Uddeshya Upadhyay, Shyamgopal Karthik, Yanbei Chen, Massimiliano Mancini, Zeynep Akata
[pdf]
[DOI]

SESS: Saliency Enhancing with Scaling and Sliding
Osman Tursun, Simon Denman, Sridha Sridharan, Clinton Fookes
[pdf]
[DOI]

No Token Left Behind: Explainability-Aided Image Classification and Generation
Roni Paiss, Hila Chefer, Lior Wolf
[pdf]
[DOI]

Interpretable Image Classification with Differentiable Prototypes Assignment
Dawid Rymarczyk, Łukasz Struski, Michał Górszczak, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński
[pdf]
[DOI]

Contributions of Shape, Texture, and Color in Visual Recognition
Yunhao Ge, Yao Xiao, Zhi Xu, Xingrui Wang, Laurent Itti
[pdf]
[DOI]

STEEX: Steering Counterfactual Explanations with Semantics
Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord
[pdf]
[DOI]

Are Vision Transformers Robust to Patch Perturbations?
Jindong Gu, Volker Tresp, Yao Qin
[pdf]
[DOI]

A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & Their Explanations
Gautam Machiraju, Sylvia Plevritis, Parag Mallick
[pdf]
[DOI]

Cartoon Explanations of Image Classifiers
Stefan Kolek, Duc Anh Nguyen, Ron Levie, Joan Bruna, Gitta Kutyniok
[pdf]
[DOI]

Shap-CAM: Visual Explanations for Convolutional Neural Networks Based on Shapley Value
Quan Zheng, Ziwei Wang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain
Jiazhen Ji, Huan Wang, Yuge Huang, Jiaxiang Wu, Xingkun Xu, Shouhong Ding, ShengChuan Zhang, Liujuan Cao, Rongrong Ji
[pdf]
[DOI]

Contrast-Phys: Unsupervised Video-Based Remote Physiological Measurement via Spatiotemporal Contrast
Zhaodong Sun, Xiaobai Li
[pdf]
[DOI]

Source-Free Domain Adaptation with Contrastive Domain Alignment and Self-Supervised Exploration for Face Anti-Spoofing
Yuchen Liu, Yabo Chen, Wenrui Dai, Mengran Gou, Chun-Ting Huang, Hongkai Xiong
[pdf]
[DOI]

On Mitigating Hard Clusters for Face Clustering
Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun
[pdf]
[DOI]

OneFace: One Threshold for All
Jiaheng Liu, Zhipeng Yu, Haoyu Qin, Yichao Wu, Ding Liang, Gangming Zhao, Ke Xu
[pdf]
[DOI]

Label2Label: A Language Modeling Framework for Multi-Attribute Learning
Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics
Gee-Sern Hsu, Rui-Cang Xie, Zhi-Ting Chen, Yu-Hong Lin
[pdf]
[DOI]

Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection
Zhihao Gu, Taiping Yao, Yang Chen, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

Rethinking Robust Representation Learning under Fine-Grained Noisy Faces
Bingqi Ma, Guanglu Song, Boxiao Liu, Yu Liu
[pdf]
[DOI]

Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition
Sungho Shin, Joosoon Lee, Junseok Lee, Yeonguk Yu, Kyoobin Lee
[pdf]
[DOI]

Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions
Tohar Lukov, Na Zhao, Gim Hee Lee, Ser-Nam Lim
[pdf]
[DOI]

Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
Shuai Shen, Wanhua Li, Zheng Zhu, Yueqi Duan, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

CoupleFace: Relation Matters for Face Recognition Distillation
Jiaheng Liu, Haoyu Qin, Yichao Wu, Jinyang Guo, Ding Liang, Ke Xu
[pdf]
[DOI]

Controllable and Guided Face Synthesis for Unconstrained Face Recognition
Feng Liu, Minchul Kim, Anil Jain, Xiaoming Liu
[pdf]
[DOI]

Towards Robust Face Recognition with Comprehensive Search
Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li
[pdf]
[DOI]

Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian
Zhiwen Cao, Dongfang Liu, Qifan Wang, Yingjie Chen
[pdf]
[DOI]

AU-Aware 3D Face Reconstruction through Personalized AU-Specific Blendshape Learning
Chenyi Kuang, Zijun Cui, Jeffrey O. Kephart, Qiang Ji
[pdf]
[DOI]

BézierPalm: A Free Lunch for Palmprint Recognition
Kai Zhao, Lei Shen, Yingyi Zhang, Chuhan Zhou, Tao Wang, Ruixin Zhang, Shouhong Ding, Wei Jia, Wei Shen
[pdf]
[DOI]

Adaptive Transformers for Robust Few-Shot Cross-Domain Face Anti-Spoofing
Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang
[pdf]
[DOI]

Face2Faceρ: Real-Time High-Resolution One-Shot Face Reenactment
Kewei Yang, Kang Chen, Daoliang Guo, Song-Hai Zhang, Yuan-Chen Guo, Weidong Zhang
[pdf]
[DOI]

Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation
Haiwen Feng, Timo Bolkart, Joachim Tesch, Michael J. Black, Victoria Abrevaya
[pdf]
[DOI]

BoundaryFace: A Mining Framework with Noise Label Self-Correction for Face Recognition
Shijie Wu, Xun Gong
[pdf]
[DOI]

Pre-training Strategies and Datasets for Facial Representation Learning
Adrian Bulat, Shiyang Cheng, Jing Yang, Andrew Garbett, Enrique Sanchez, Georgios Tzimiropoulos
[pdf]
[DOI]

Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency
Isaac Kasahara, Simon Stent, Hyun Soo Park
[pdf]
[DOI]

MFIM: Megapixel Facial Identity Manipulation
Sanghyeon Na
[pdf]
[DOI]

3D Face Reconstruction with Dense Landmarks
Erroll Wood, Tadas Baltrušaitis, Charlie Hewitt, Matthew Johnson, Jingjing Shen, Nikola Milosavljević, Daniel Wilde, Stephan Garbin, Toby Sharp, Ivan Stojiljković, Tom Cashman, Julien Valentin
[pdf]
[DOI]

Emotion-Aware Multi-View Contrastive Learning for Facial Emotion Recognition
Daeha Kim, Byung Cheol Song
[pdf]
[DOI]

Order Learning Using Partially Ordered Data via Chainization
Seon-Ho Lee, Chang-Su Kim
[pdf]
[DOI]

Unsupervised High-Fidelity Facial Texture Generation and Reconstruction
Ron Slossberg, Ibrahim Jubran, Ron Kimmel
[pdf]
[DOI]

Multi-Domain Learning for Updating Face Anti-Spoofing Models
Xiao Guo, Yaojie Liu, Anil Jain, Xiaoming Liu
[pdf]
[DOI]

Towards Metrical Reconstruction of Human Faces
Wojciech Zielonka, Timo Bolkart, Justus Thies
[pdf]
[DOI]

Discover and Mitigate Unknown Biases with Debiasing Alternate Networks
Zhiheng Li, Anthony Hoogs, Chenliang Xu
[pdf]
[DOI]

Unsupervised and Semi-Supervised Bias Benchmarking in Face Recognition
Alexandra Chouldechova, Siqi Deng, Yongxin Wang, Wei Xia, Pietro Perona
[pdf]
[DOI]

Towards Efficient Adversarial Training on Vision Transformers
Boxi Wu, Jindong Gu, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu
[pdf]
[DOI]

MIME: Minority Inclusion for Majority Group Enhancement of AI Performance
Pradyumna Chari, Yunhao Ba, Shreeram Athreya, Achuta Kadambi
[pdf]
[DOI]

Studying Bias in GANs through the Lens of Race
Vongani H. Maluleke, Neerja Thakkar, Tim Brooks, Ethan Weber, Trevor Darrell, Alexei A. Efros, Angjoo Kanazawa, Devin Guillory
[pdf]
[DOI]

Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness
Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi
[pdf]
[DOI]

Learning to Censor by Noisy Sampling
Ayush Chopra, Abhinav Java, Abhishek Singh, Vivek Sharma, Ramesh Raskar
[pdf]
[DOI]

An Invisible Black-Box Backdoor Attack through Frequency Domain
Tong Wang, Yuan Yao, Feng Xu, Shengwei An, Hanghang Tong, Ting Wang
[pdf]
[DOI]

FairGRAPE: Fairness-Aware GRAdient Pruning mEthod for Face Attribute Classification
Xiaofeng Lin, Seungbae Kim, Jungseock Joo
[pdf]
[DOI]

Attaining Class-Level Forgetting in Pretrained Model Using Few Samples
Pravendra Singh, Pratik Mazumder, Mohammed Asad Karim
[pdf]
[DOI]

Anti-Neuron Watermarking: Protecting Personal Data against Unauthorized Neural Networks
Zihang Zou, Boqing Gong, Liqiang Wang
[pdf]
[DOI]

An Impartial Take to the CNN vs Transformer Robustness Contest
Francesco Pinto, Philip H. S. Torr, Puneet K. Dokania
[pdf]
[DOI]

Recover Fair Deep Classification Models via Altering Pre-trained Structure
Yanfu Zhang, Shangqian Gao, Heng Huang
[pdf]
[DOI]

Decouple-and-Sample: Protecting Sensitive Information in Task Agnostic Data Release
Abhishek Singh, Ethan Garza, Ayush Chopra, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar
[pdf]
[DOI]

Privacy-Preserving Action Recognition via Motion Difference Quantization
Sudhakar Kumawat, Hajime Nagahara
[pdf]
[DOI]

Latent Space Smoothing for Individually Fair Representations
Momchil Peychev, Anian Ruoss, Mislav Balunović, Maximilian Baader, Martin Vechev
[pdf]
[DOI]

Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration
Christian Tomani, Daniel Cremers, Florian Buettner
[pdf]
[DOI]

FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations
Cemre Efe Karakas, Alara Dirik, Eylül Yalçınkaya, Pinar Yanardag
[pdf]
[DOI]

Distilling the Undistillable: Learning from a Nasty Teacher
Surgan Jandial, Yash Khasbage, Arghya Pal, Vineeth N Balasubramanian, Balaji Krishnamurthy
[pdf]
[DOI]

SOS! Self-Supervised Learning over Sets of Handled Objects in Egocentric Action Recognition
Victor Escorcia, Ricardo Guerrero, Xiatian Zhu, Brais Martinez
[pdf]
[DOI]

Egocentric Activity Recognition and Localization on a 3D Map
Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li
[pdf]
[DOI]

Generative Adversarial Network for Future Hand Segmentation from Egocentric Video
Wenqi Jia, Miao Liu, James M. Rehg
[pdf]
[DOI]

My View Is the Best View: Procedure Learning from Egocentric Videos
Siddhant Bansal, Chetan Arora, C.V. Jawahar
[pdf]
[DOI]

GIMO: Gaze-Informed Human Motion Prediction in Context
Yang Zheng, Yanchao Yang, Kaichun Mo, Jiaman Li, Tao Yu, Yebin Liu, Karen Liu, Leonidas J. Guibas
[pdf]
[DOI]

Image-Based CLIP-Guided Essence Transfer
Hila Chefer, Sagie Benaim, Roni Paiss, Lior Wolf
[pdf]
[DOI]

Detecting and Recovering Sequential DeepFake Manipulation
Rui Shao, Tianxing Wu, Ziwei Liu
[pdf]
[DOI]

Self-Supervised Sparse Representation for Video Anomaly Detection
Jhih-Ciang Wu, He-Yen Hsieh, Ding-Jie Chen, Chiou-Shann Fuh, Tyng-Luh Liu
[pdf]
[DOI]

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal
Xinwei Liu, Jian Liu, Yang Bai, Jindong Gu, Tao Chen, Xiaojun Jia, Xiaochun Cao
[pdf]
[DOI]

Explaining Deepfake Detection by Analysing Image Matching
Shichao Dong, Jin Wang, Jiajun Liang, Haoqiang Fan, Renhe Ji
[pdf]
[DOI]

FrequencyLowCut Pooling – Plug & Play against Catastrophic Overfitting
Julia Grabinski, Steffen Jung, Janis Keuper, Margret Keuper
[pdf]
[DOI]

TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations
Shivangi Aneja, Lev Markhasin, Matthias Nießner
[pdf]
[DOI]

FingerprintNet: Synthesized Fingerprints for Generated Image Detection
Yonghyun Jeong, Doyeon Kim, Youngmin Ro, Pyounggeon Kim, Jongwon Choi
[pdf]
[DOI]

Detecting Generated Images by Real Images
Bo Liu, Fan Yang, Xiuli Bi, Bin Xiao, Weisheng Li, Xinbo Gao
[pdf]
[DOI]

An Information Theoretic Approach for Attention-Driven Face Forgery Detection
Ke Sun, Hong Liu, Taiping Yao, Xiaoshuai Sun, Shen Chen, Shouhong Ding, Rongrong Ji
[pdf]
[DOI]

Exploring Disentangled Content Information for Face Forgery Detection
Jiahao Liang, Huafeng Shi, Weihong Deng
[pdf]
[DOI]

RepMix: Representation Mixing for Robust Attribution of Synthesized Images
Tu Bui, Ning Yu, John Collomosse
[pdf]
[DOI]

Totems: Physical Objects for Verifying Visual Integrity
Jingwei Ma, Lucy Chai, Minyoung Huh, Tongzhou Wang, Ser-Nam Lim, Phillip Isola, Antonio Torralba
[pdf]
[DOI]

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval
Pandeng Li, Hongtao Xie, Jiannan Ge, Lei Zhang, Shaobo Min, Yongdong Zhang
[pdf]
[DOI]

PASS: Part-Aware Self-Supervised Pre-training for Person Re-identification
Kuan Zhu, Haiyun Guo, Tianyi Yan, Yousong Zhu, Jinqiao Wang, Ming Tang
[pdf]
[DOI]

Adaptive Cross-Domain Learning for Generalizable Person Re-identification
Pengyi Zhang, Huanzhang Dou, Yunlong Yu, Xi Li
[pdf]
[DOI]

Multi-Query Video Retrieval
Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
[pdf]
[DOI]

Hierarchical Average Precision Training for Pertinent Image Retrieval
Elias Ramzi, Nicolas Audebert, Nicolas Thome, Clément Rambour, Xavier Bitot
[pdf]
[DOI]

Learning Semantic Correspondence with Sparse Annotations
Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava
[pdf]
[DOI]

Dynamically Transformed Instance Normalization Network for Generalizable Person Re-identification
Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Lu Yang, Shizhou Zhang, Peng Wang, Yanning Zhang
[pdf]
[DOI]

Domain Adaptive Person Search
Junjie Li, Yichao Yan, Guanshuo Wang, Fufu Yu, Qiong Jia, Shouhong Ding
[pdf]
[DOI]

TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin
[pdf]
[DOI]

Unstructured Feature Decoupling for Vehicle Re-identification
Wen Qian, Hao Luo, Silong Peng, Fan Wang, Chen Chen, Hao Li
[pdf]
[DOI]

Deep Hash Distillation for Image Retrieval
Young Kyun Jang, Geonmo Gu, Byungsoo Ko, Isaac Kang, Nam Ik Cho
[pdf]
[DOI]

Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification
Boqiang Xu, Jian Liang, Lingxiao He, Zhenan Sun
[pdf]
[DOI]

Granularity-Aware Adaptation for Image Retrieval over Multiple Tasks
Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis
[pdf]
[DOI]

Learning Audio-Video Modalities from Image Captions
Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid
[pdf]
[DOI]

RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-Supervised Learning
Wei-Ting Chen, I-Hsiang Chen, Chih-Yuan Yeh, Hao-Hsiang Yang, Hua-En Chang, Jian-Jiun Ding, Sy-Yen Kuo
[pdf]
[DOI]

Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval
Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Jianfeng Dong, Xirong Li
[pdf]
[DOI]

Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-identification
Yiyuan Zhang, Sanyuan Zhao, Yuhao Kang, Jianbing Shen
[pdf]
[DOI]

Cross-Modality Transformer for Visible-Infrared Person Re-identification
Kongzhu Jiang, Tianzhu Zhang, Xiang Liu, Bingqiao Qian, Yongdong Zhang, Feng Wu
[pdf]
[DOI]

Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment
Sangmin Lee, Sungjune Park, Yong Man Ro
[pdf]
[DOI]

Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search
Haokui Zhang, Buzhou Tang, Wenze Hu, Xiaoyu Wang
[pdf]
[DOI]

SEMICON: A Learning-to-Hash Solution for Large-Scale Fine-Grained Image Retrieval
Yang Shen, Xuhao Sun, Xiu-Shen Wei, Qing-Yuan Jiang, Jian Yang
[pdf]
[DOI]

CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification
Jinlin Wu, Lingxiao He, Wu Liu, Yang Yang, Zhen Lei, Tao Mei, Stan Z. Li
[pdf]
[DOI]

Text-Based Temporal Localization of Novel Events
Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury
[pdf]
[DOI]

Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval
Zhaopeng Dou, Zhongdao Wang, Weihua Chen, Yali Li, Shengjin Wang
[pdf]
[DOI]

Relighting4D: Neural Relightable Human from Videos
Zhaoxi Chen, Ziwei Liu
[pdf]
[DOI]

Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Zhewei Huang, Tianyuan Zhang, Wen Heng, Boxin Shi, Shuchang Zhou
[pdf]
[DOI]

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
Jing He, Yiyi Zhou, Qi Zhang, Jun Peng, Yunhang Shen, Xiaoshuai Sun, Chao Chen, Rongrong Ji
[pdf]
[DOI]

StyleSwap: Style-Based Generator Empowers Robust Face Swapping
Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Paint2Pix: Interactive Painting Based Progressive Image Synthesis and Editing
Jaskirat Singh, Liang Zheng, Cameron Smith, Jose Echevarria
[pdf]
[DOI]

FurryGAN: High Quality Foreground-Aware Image Synthesis
Jeongmin Bae, Mingi Kwon, Youngjung Uh
[pdf]
[DOI]

SCAM! Transferring Humans between Images with Semantic Cross Attention Modulation
Nicolas Dufour, David Picard, Vicky Kalogeiton
[pdf]
[DOI]

Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
[pdf]
[DOI]

WaveGAN: Frequency-Aware GAN for High-Fidelity Few-Shot Image Generation
Mengping Yang, Zhe Wang, Ziqiu Chi, Wenyi Feng
[pdf]
[DOI]

End-to-End Visual Editing with a Generatively Pre-trained Artist
Andrew Brown, Cheng-Yang Fu, Omkar Parkhi, Tamara L. Berg, Andrea Vedaldi
[pdf]
[DOI]

High-Fidelity GAN Inversion with Padding Space
Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen
[pdf]
[DOI]

Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping
Chao Xu, Jiangning Zhang, Yue Han, Guanzhong Tian, Xianfang Zeng, Ying Tai, Yabiao Wang, Chengjie Wang, Yong Liu
[pdf]
[DOI]

Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives
Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang
[pdf]
[DOI]

Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors
Oran Gafni, Adam Polyak, Oron Ashual, Shelly Sheynin, Devi Parikh, Yaniv Taigman
[pdf]
[DOI]

3D-FM GAN: Towards 3D-Controllable Face Manipulation
Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S.Y. Kung
[pdf]
[DOI]

Multi-Curve Translator for High-Resolution Photorealistic Image Translation
Yuda Song, Hui Qian, Xin Du
[pdf]
[DOI]

Deep Bayesian Video Frame Interpolation
Zhiyang Yu, Yu Zhang, Xujie Xiang, Dongqing Zou, Xijun Chen, Jimmy S. Ren
[pdf]
[DOI]

Cross Attention Based Style Distribution for Controllable Person Image Synthesis
Xinyue Zhou, Mingyu Yin, Xinyuan Chen, Li Sun, Changxin Gao, Qingli Li
[pdf]
[DOI]

KeypointNeRF: Generalizing Image-Based Volumetric Avatars Using Relative Spatial Encoding of Keypoints
Marko Mihajlovic, Aayush Bansal, Michael Zollhöfer, Siyu Tang, Shunsuke Saito
[pdf]
[DOI]

ViewFormer: NeRF-Free Neural Rendering from Few Images Using Transformers
Jonáš Kulhánek, Erik Derner, Torsten Sattler, Robert Babuška
[pdf]
[DOI]

L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing
Ziyu Chen, Chenjing Ding, Jianfei Guo, Dongliang Wang, Yikang Li, Xuan Xiao, Wei Wu, Li Song
[pdf]
[DOI]

A Perceptual Quality Metric for Video Frame Interpolation
Qiqi Hou, Abhijay Ghildyal, Feng Liu
[pdf]
[DOI]

Adaptive Feature Interpolation for Low-Shot Image Generation
Mengyu Dai, Haibin Hang, Xiaoyang Guo
[pdf]
[DOI]

PalGAN: Image Colorization with Palette Generative Adversarial Networks
Yi Wang, Menghan Xia, Lu Qi, Jing Shao, Yu Qiao
[pdf]
[DOI]

Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis
Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu
[pdf]
[DOI]

Learning Prior Feature and Attention Enhanced Image Inpainting
Chenjie Cao, Qiaole Dong, Yanwei Fu
[pdf]
[DOI]

Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning
Wenpeng Xing, Jie Chen
[pdf]
[DOI]

3D-Aware Semantic-Guided Generative Model for Human Synthesis
Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Temporally Consistent Semantic Video Editing
Yiran Xu, Badour AlBahar, Jia-Bin Huang
[pdf]
[DOI]

Error Compensation Framework for Flow-Guided Video Inpainting
Jaeyeon Kang, Seoung Wug Oh, Seon Joo Kim
[pdf]
[DOI]

Scraping Textures from Natural Images for Synthesis and Editing
Xueting Li, Xiaolong Wang, Ming-Hsuan Yang, Alexei A. Efros, Sifei Liu
[pdf]
[DOI]

Single Stage Virtual Try-On via Deformable Attention Flows
Shuai Bai, Huiling Zhou, Zhikang Li, Chang Zhou, Hongxia Yang
[pdf]
[DOI]

Improving GANs for Long-Tailed Data through Group Spectral Regularization
Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, R. Venkatesh Babu
[pdf]
[DOI]

Hierarchical Semantic Regularization of Latent Spaces in StyleGANs
Tejan Karmali, Rishubh Parihar, Susmit Agrawal, Harsh Rangwani, Varun Jampani, Maneesh Singh, R. Venkatesh Babu
[pdf]
[DOI]

IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion
Seung-Jun Moon, Gyeong-Moon Park
[pdf]
[DOI]

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing
Guangcong Wang, Yinuo Yang, Chen Change Loy, Ziwei Liu
[pdf]
[DOI]

Contrastive Monotonic Pixel-Level Modulation
Kun Lu, Rongpeng Li, Honggang Zhang
[pdf]
[DOI]

Learning Cross-Video Neural Representations for High-Quality Frame Interpolation
Wentao Shangguan, Yu Sun, Weijie Gan, Ulugbek S. Kamilov
[pdf]
[DOI]

Learning Continuous Implicit Representation for Near-Periodic Patterns
Bowei Chen, Tiancheng Zhi, Martial Hebert, Srinivasa G. Narasimhan
[pdf]
[DOI]

End-to-End Graph-Constrained Vectorized Floorplan Generation with Panoptic Refinement
Jiachen Liu, Yuan Xue, Jose Duarte, Krishnendra Shekhawat, Zihan Zhou, Xiaolei Huang
[pdf]
[DOI]

Few-Shot Image Generation with Mixup-Based Distance Learning
Chaerin Kong, Jeesoo Kim, Donghoon Han, Nojun Kwak
[pdf]
[DOI]

A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos
Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier
[pdf]
[DOI]

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs
Ziqiang Li, Chaoyue Wang, Heliang Zheng, Jing Zhang, Bin Li
[pdf]
[DOI]

BlobGAN: Spatially Disentangled Scene Representations
Dave Epstein, Taesung Park, Richard Zhang, Eli Shechtman, Alexei A. Efros
[pdf]
[DOI]

Unified Implicit Neural Stylization
Zhiwen Fan, Yifan Jiang, Peihao Wang, Xinyu Gong, Dejia Xu, Zhangyang Wang
[pdf]
[DOI]

GAN with Multivariate Disentangling for Controllable Hair Editing
Xuyang Guo, Meina Kan, Tianle Chen, Shiguang Shan
[pdf]
[DOI]

Discovering Transferable Forensic Features for CNN-Generated Images Detection
Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Alexander Binder, Ngai-Man Cheung
[pdf]
[DOI]

Harmonizer: Learning to Perform White-Box Image and Video Harmonization
Zhanghan Ke, Chunyi Sun, Lei Zhu, Ke Xu, Rynson W.H. Lau
[pdf]
[DOI]

Text2LIVE: Text-Driven Layered Image and Video Editing
Omer Bar-Tal, Dolev Ofri-Amar, Rafail Fridman, Yoni Kasten, Tali Dekel
[pdf]
[DOI]

Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation
Jian Zhang, Jinchi Huang, Bowen Cai, Huan Fu, Mingming Gong, Chaohui Wang, Jiaming Wang, Hongchen Luo, Rongfei Jia, Binqiang Zhao, Xing Tang
[pdf]
[DOI]

StyleGAN-Human: A Data-Centric Odyssey of Human Generation
Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
[pdf]
[DOI]

ColorFormer: Image Colorization via Color Memory Assisted Hybrid-Attention Transformer
Xiaozhong Ji, Boyuan Jiang, Donghao Luo, Guangpin Tao, Wenqing Chu, Zhifeng Xie, Chengjie Wang, Ying Tai
[pdf]
[DOI]

EAGAN: Efficient Two-Stage Evolutionary Architecture Search for GANs
Guohao Ying, Xin He, Bin Gao, Bo Han, Xiaowen Chu
[pdf]
[DOI]

Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation
Dae-Young Song, Geonsoo Lee, HeeKyung Lee, Gi-Mun Um, Donghyeon Cho
[pdf]
[DOI]

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation
Songhua Liu, Jingwen Ye, Sucheng Ren, Xinchao Wang
[pdf]
[DOI]

Multimodal Conditional Image Synthesis with Product-of-Experts GANs
Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
[pdf]
[DOI]

Auto-Regressive Image Synthesis with Integrated Quantization
Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Changgong Zhang, Shijian Lu
[pdf]
[DOI]

JoJoGAN: One Shot Face Stylization
Min Jin Chong, David Forsyth
[pdf]
[DOI]

VecGAN: Image-to-Image Translation with Interpretable Latent Directions
Yusuf Dalva, Said Fahri Altındiş, Aysegul Dundar
[pdf]
[DOI]

Any-Resolution Training for High-Resolution Image Synthesis
Lucy Chai, Michaël Gharbi, Eli Shechtman, Phillip Isola, Richard Zhang
[pdf]
[DOI]

CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer
Zijie Wu, Zhen Zhu, Junping Du, Xiang Bai
[pdf]
[DOI]

CANF-VC: Conditional Augmented Normalizing Flows for Video Compression
Yung-Han Ho, Chih-Peng Chang, Peng-Yu Chen, Alessandro Gnutti, Wen-Hsiao Peng
[pdf]
[DOI]

Bi-Level Feature Alignment for Versatile Image Translation and Manipulation
Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Aoran Xiao, Shijian Lu, Chunyan Miao
[pdf]
[DOI]

High-Fidelity Image Inpainting with GAN Inversion
Yongsheng Yu, Libo Zhang, Heng Fan, Tiejian Luo
[pdf]
[DOI]

DeltaGAN: Towards Diverse Few-Shot Image Generation with Sample-Specific Delta
Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang
[pdf]
[DOI]

Image Inpainting with Cascaded Modulation GAN and Object-Aware Training
Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo
[pdf]
[DOI]

StyleFace: Towards Identity-Disentangled Face Generation on Megapixels
Yuchen Luo, Junwei Zhu, Keke He, Wenqing Chu, Ying Tai, Chengjie Wang, Junchi Yan
[pdf]
[DOI]

Video Extrapolation in Space and Time
Yunzhi Zhang, Jiajun Wu
[pdf]
[DOI]

Contrastive Learning for Diverse Disentangled Foreground Generation
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
[pdf]
[DOI]

BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-Aided Adversarial Learning
Changgyoon Oh, Wonjune Cho, Yujeong Chae, Daehee Park, Lin Wang, Kuk-Jin Yoon
[pdf]
[DOI]

Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos
Cheng-Ju Hsieh, Wei-Hao Chung, Chiou-Ting Hsu
[pdf]
[DOI]

Geometry-Aware Single-Image Full-Body Human Relighting
Chaonan Ji, Tao Yu, Kaiwen Guo, Jingxin Liu, Yebin Liu
[pdf]
[DOI]

3D-Aware Indoor Scene Synthesis with Depth Priors
Zifan Shi, Yujun Shen, Jiapeng Zhu, Dit-Yan Yeung, Qifeng Chen
[pdf]
[DOI]

Deep Portrait Delighting
Joshua Weir, Junhong Zhao, Andrew Chalmers, Taehyun Rhee
[pdf]
[DOI]

Vector Quantized Image-to-Image Translation
Yu-Jie Chen, Shin-I Cheng, Wei-Chen Chiu, Hung-Yu Tseng, Hsin-Ying Lee
[pdf]
[DOI]

The Surprisingly Straightforward Scene Text Removal Method with Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis
Hyeonsu Lee, Chankyu Choi
[pdf]
[DOI]

Free-Viewpoint RGB-D Human Performance Capture and Rendering
Phong Nguyen-Ha, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkilä, Tony Tung
[pdf]
[DOI]

Multiview Regenerative Morphing with Dual Flows
Chih-Jung Tsai, Cheng Sun, Hwann-Tzong Chen
[pdf]
[DOI]

Hallucinating Pose-Compatible Scenes
Tim Brooks, Alexei A. Efros
[pdf]
[DOI]

Motion and Appearance Adaptation for Cross-Domain Motion Transfer
Borun Xu, Biao Wang, Jinhong Deng, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
[pdf]
[DOI]

Layered Controllable Video Generation
Jiahui Huang, Yuhe Jin, Kwang Moo Yi, Leonid Sigal
[pdf]
[DOI]

Custom Structure Preservation in Face Aging
Guillermo Gomez-Trenado, Stéphane Lathuilière, Pablo Mesejo, Óscar Cordón
[pdf]
[DOI]

Spatio-Temporal Deformable Attention Network for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
[pdf]
[DOI]

NeuMesh: Learning Disentangled Neural Mesh-Based Implicit Field for Geometry and Texture Editing
Bangbang Yang, Chong Bao, Junyi Zeng, Hujun Bao, Yinda Zhang, Zhaopeng Cui, Guofeng Zhang
[pdf]
[DOI]

NeRF for Outdoor Scene Relighting
Viktor Rudnev, Mohamed Elgharib, William Smith, Lingjie Liu, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

CoGS: Controllable Generation and Search from Sketch and Style
Cusuh Ham, Gemma Canet Tarrés, Tu Bui, James Hays, Zhe Lin, John Collomosse
[pdf]
[DOI]

HairNet: Hairstyle Transfer with Pose Changes
Peihao Zhu, Rameen Abdal, John Femiani, Peter Wonka
[pdf]
[DOI]

Unbiased Multi-Modality Guidance for Image Inpainting
Yongsheng Yu, Dawei Du, Libo Zhang, Tiejian Luo
[pdf]
[DOI]

Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents
Jaskirat Singh, Cameron Smith, Jose Echevarria, Liang Zheng
[pdf]
[DOI]

Motion Transformer for Unsupervised Image Animation
Jiale Tao, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
[pdf]
[DOI]

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan
[pdf]
[DOI]

EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer
Chenyu Yang, Wanrong He, Yingqing Xu, Yang Gao
[pdf]
[DOI]

Editing Out-of-Domain GAN Inversion via Differential Activations
Haorui Song, Yong Du, Tianyi Xiang, Junyu Dong, Jing Qin, Shengfeng He
[pdf]
[DOI]

On the Robustness of Quality Measures for GANs
Motasem Alfarra, Juan C. Pérez, Anna Frühstück, Philip H. S. Torr, Peter Wonka, Bernard Ghanem
[pdf]
[DOI]

Sound-Guided Semantic Video Generation
Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Hyunjun Cho, Jihyun Bae, Jinkyu Kim, Sangpil Kim
[pdf]
[DOI]

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation
Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi
[pdf]
[DOI]

Controllable Video Generation through Global and Local Motion Dynamics
Aram Davtyan, Paolo Favaro
[pdf]
[DOI]

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN
Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
[pdf]
[DOI]

Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge, Thomas Hayes, Harry Yang, Xi Yin, Guan Pang, David Jacobs, Jia-Bin Huang, Devi Parikh
[pdf]
[DOI]

Combining Internal and External Constraints for Unrolling Shutter in Videos
Eyal Naor, Itai Antebi, Shai Bagon, Michal Irani
[pdf]
[DOI]

WISE: Whitebox Image Stylization by Example-Based Learning
Winfried Lötzsch, Max Reimann, Martin Büssemeyer, Amir Semmo, Jürgen Döllner, Matthias Trapp
[pdf]
[DOI]

Neural Radiance Transfer Fields for Relightable Novel-View Synthesis with Global Illumination
Linjie Lyu, Ayush Tewari, Thomas Leimkühler, Marc Habermann, Christian Theobalt
[pdf]
[DOI]

Transformers As Meta-Learners for Implicit Neural Representations
Yinbo Chen, Xiaolong Wang
[pdf]
[DOI]

Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment
Taewoo Kim, Chaeyeon Chung, Yoonseo Kim, Sunghyun Park, Kangyeol Kim, Jaegul Choo
[pdf]
[DOI]

High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions
Sangyun Lee, Gyojung Gu, Sunghyun Park, Seunghwan Choi, Jaegul Choo
[pdf]
[DOI]

A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution
Hengsheng Zhang, Xueyi Zou, Jiaming Guo, Youliang Yan, Rong Xie, Li Song
[pdf]
[DOI]

Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis
Jeong-gi Kwak, Yuanming Li, Dongsik Yoon, Donghyeon Kim, David Han, Hanseok Ko
[pdf]
[DOI]

AdaNeRF: Adaptive Sampling for Real-Time Rendering of Neural Radiance Fields
Andreas Kurz, Thomas Neff, Zhaoyang Lv, Michael Zollhöfer, Markus Steinberger
[pdf]
[DOI]

Improving the Perceptual Quality of 2D Animation Interpolation
Shuhong Chen, Matthias Zwicker
[pdf]
[DOI]

Selective TransHDR: Transformer-Based Selective HDR Imaging Using Ghost Region Mask
Jou Won Song, Ye-In Park, Kyeongbo Kong, Jaeho Kwak, Suk-Ju Kang
[pdf]
[DOI]

Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution
Cheng Ma, Jingyi Zhang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints
Di Chen, Yu Liu, Lianghua Huang, Bin Wang, Pan Pan
[pdf]
[DOI]

DoodleFormer: Creative Sketch Drawing with Transformers
Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg
[pdf]
[DOI]

Implicit Neural Representations for Variable Length Human Motion Generation
Pablo Cervantes, Yusuke Sekikawa, Ikuro Sato, Koichi Shinoda
[pdf]
[DOI]

Learning Object Placement via Dual-Path Graph Completion
Siyuan Zhou, Liu Liu, Li Niu, Liqing Zhang
[pdf]
[DOI]

Expanded Adaptive Scaling Normalization for End to End Image Compression
Chajin Shin, Hyeongmin Lee, Hanbin Son, Sangjin Lee, Dogyoon Lee, Sangyoun Lee
[pdf]
[DOI]

Generator Knows What Discriminator Should Learn in Unconditional GANs
Gayoung Lee, Hyunsu Kim, Junho Kim, Seonghyeon Kim, Jung-Woo Ha, Yunjey Choi
[pdf]
[DOI]

Compositional Visual Generation with Composable Diffusion Models
Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum
[pdf]
[DOI]

ManiFest: Manifold Deformation for Few-Shot Image Translation
Fabio Pizzati, Jean-François Lalonde, Raoul de Charette
[pdf]
[DOI]

Supervised Attribute Information Removal and Reconstruction for Image Manipulation
Nannan Li, Bryan A. Plummer
[pdf]
[DOI]

BLT: Bidirectional Layout Transformer for Controllable Layout Generation
Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa
[pdf]
[DOI]

Diverse Generation from a Single Video Made Possible
Niv Haim, Ben Feinstein, Niv Granot, Assaf Shocher, Shai Bagon, Tali Dekel, Michal Irani
[pdf]
[DOI]

Rayleigh EigenDirections (REDs): Nonlinear GAN Latent Space Traversals for Multidimensional Features
Guha Balakrishnan, Raghudeep Gadde, Aleix Martinez, Pietro Perona
[pdf]
[DOI]

Bridging the Domain Gap towards Generalization in Automatic Colorization
Hyejin Lee, Daehee Kim, Daeun Lee, Jinkyu Kim, Jaekoo Lee
[pdf]
[DOI]

Generating Natural Images with Direct Patch Distributions Matching
Ariel Elnekave, Yair Weiss
[pdf]
[DOI]

Context-Consistent Semantic Image Editing with Style-Preserved Modulation
Wuyang Luo, Su Yang, Hong Wang, Bo Long, Weishan Zhang
[pdf]
[DOI]

Eliminating Gradient Conflict in Reference-Based Line-Art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
[pdf]
[DOI]

Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations
Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada
[pdf]
[DOI]

JPEG Artifacts Removal via Contrastive Representation Learning
Xi Wang, Xueyang Fu, Yurui Zhu, Zheng-Jun Zha
[pdf]
[DOI]

Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning
Xiang Chen, Zhentao Fan, Pengpeng Li, Longgang Dai, Caihua Kong, Zhuoran Zheng, Yufeng Huang, Yufeng Li
[pdf]
[DOI]

Efficient Long-Range Attention Network for Image Super-Resolution
Xindong Zhang, Hui Zeng, Shi Guo, Lei Zhang
[pdf]
[DOI]

FlowFormer: A Transformer Architecture for Optical Flow
Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li
[pdf]
[DOI]

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction
Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc Van Gool
[pdf]
[DOI]

Learning Shadow Correspondence for Video Shadow Detection
Xinpeng Ding, Jingwen Yang, Xiaowei Hu, Xiaomeng Li
[pdf]
[DOI]

Metric Learning Based Interactive Modulation for Real-World Super-Resolution
Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan
[pdf]
[DOI]

Dynamic Dual Trainable Bounds for Ultra-Low Precision Super-Resolution Networks
Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
[pdf]
[DOI]

OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
Jialun Pei, Tianyang Cheng, Deng-Ping Fan, He Tang, Chuanbo Chen, Luc Van Gool
[pdf]
[DOI]

Highly Accurate Dichotomous Image Segmentation
Xuebin Qin, Hang Dai, Xiaobin Hu, Deng-Ping Fan, Ling Shao, Luc Van Gool
[pdf]
[DOI]

Boosting Supervised Dehazing Methods via Bi-Level Patch Reweighting
Xingyu Jiang, Hongkun Dou, Chengwei Fu, Bingquan Dai, Tianrun Xu, Yue Deng
[pdf]
[DOI]

Flow-Guided Transformer for Video Inpainting
Kaidong Zhang, Jingjing Fu, Dong Liu
[pdf]
[DOI]

Shift-tolerant Perceptual Similarity Metric
Abhijay Ghildyal, Feng Liu
[pdf]
[DOI]

Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution
Yuehan Zhang, Bo Ji, Jia Hao, Angela Yao
[pdf]
[DOI]

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng
[pdf]
[DOI]

Uncertainty Learning in Kernel Estimation for Multi-stage Blind Image Super-Resolution
Zhenxuan Fang, Weisheng Dong, Xin Li, Jinjian Wu, Leida Li, Guangming Shi
[pdf]
[DOI]

Learning Spatio-Temporal Downsampling for Effective Video Upscaling
Xiaoyu Xiang, Yapeng Tian, Vijay Rengarajan, Lucas D. Young, Bo Zhu, Rakesh Ranjan
[pdf]
[DOI]

Learning Local Implicit Fourier Representation for Image Warping
Jaewon Lee, Kwang Pyo Choi, Kyong Hwan Jin
[pdf]
[DOI]

SepLUT: Separable Image-Adaptive Lookup Tables for Real-Time Image Enhancement
Canqian Yang, Meiguang Jin, Yi Xu, Rui Zhang, Ying Chen, Huaida Liu
[pdf]
[DOI]

Blind Image Decomposition
Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li
[pdf]
[DOI]

MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution
Jiacheng Li, Chang Chen, Zhen Cheng, Zhiwei Xiong
[pdf]
[DOI]

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution
Zhongwei Qiu, Huan Yang, Jianlong Fu, Dongmei Fu
[pdf]
[DOI]

Spatial-Frequency Domain Information Integration for Pan-Sharpening
Man Zhou, Jie Huang, Keyu Yan, Hu Yu, Xueyang Fu, Aiping Liu, Xian Wei, Feng Zhao
[pdf]
[DOI]

Adaptive Patch Exiting for Scalable Single Image Super-Resolution
Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
[pdf]
[DOI]

Efficient Meta-Tuning for Content-Aware Neural Video Delivery
Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
[pdf]
[DOI]

Reference-Based Image Super-Resolution with Deformable Attention Transformer
Jiezhang Cao, Jingyun Liang, Kai Zhang, Yawei Li, Yulun Zhang, Wenguan Wang, Luc Van Gool
[pdf]
[DOI]

Local Color Distributions Prior for Image Enhancement
Haoyuan Wang, Ke Xu, Rynson W.H. Lau
[pdf]
[DOI]

L-CoDer: Language-Based Colorization with Color-Object Decoupling Transformer
Zheng Chang, Shuchen Weng, Yu Li, Si Li, Boxin Shi
[pdf]
[DOI]

From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution
Xiaoming Li, Chaofeng Chen, Xianhui Lin, Wangmeng Zuo, Lei Zhang
[pdf]
[DOI]

Towards Interpretable Video Super-Resolution via Alternating Optimization
Jiezhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc Van Gool
[pdf]
[DOI]

Event-Based Fusion for Motion Deblurring with Cross-Modal Attention
Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc Van Gool
[pdf]
[DOI]

Fast and High Quality Image Denoising via Malleable Convolution
Yifan Jiang, Bartlomiej Wronski, Ben Mildenhall, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue
[pdf]
[DOI]

TAPE: Task-Agnostic Prior Embedding for Image Restoration
Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
[pdf]
[DOI]

Uncertainty Inspired Underwater Image Enhancement
Zhenqi Fu, Wu Wang, Yue Huang, Xinghao Ding, Kai-Kuang Ma
[pdf]
[DOI]

Hourglass Attention Network for Image Inpainting
Ye Deng, Siqi Hui, Rongye Meng, Sanping Zhou, Jinjun Wang
[pdf]
[DOI]

Unfolded Deep Kernel Estimation for Blind Image Super-Resolution
Hongyi Zheng, Hongwei Yong, Lei Zhang
[pdf]
[DOI]

Event-Guided Deblurring of Unknown Exposure Time Videos
Taewoo Kim, Jeongmin Lee, Lin Wang, Kuk-Jin Yoon
[pdf]
[DOI]

ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-Modality Image Fusion
Zhanbo Huang, Jinyuan Liu, Xin Fan, Risheng Liu, Wei Zhong, Zhongxuan Luo
[pdf]
[DOI]

Content Adaptive Latents and Decoder for Neural Image Compression
Guanbo Pan, Guo Lu, Zhihao Hu, Dong Xu
[pdf]
[DOI]

Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution
Jie Liang, Hui Zeng, Lei Zhang
[pdf]
[DOI]

Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-Ahead Forward Ones
Junyi Li, Xiaohe Wu, Zhenxing Niu, Wangmeng Zuo
[pdf]
[DOI]

Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations
Zhilu Zhang, Ruohao Wang, Hongzhi Zhang, Yunjin Chen, Wangmeng Zuo
[pdf]
[DOI]

Secrets of Event-Based Optical Flow
Shintaro Shiba, Yoshimitsu Aoki, Guillermo Gallego
[pdf]
[DOI]

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing
Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi
[pdf]
[DOI]

ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring
Bangrui Jiang, Zhihuai Xie, Zhen Xia, Songnan Li, Shan Liu
[pdf]
[DOI]

Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion
Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii, Takayoshi Yamashita
[pdf]
[DOI]

ART-SS: An Adaptive Rejection Technique for Semi-Supervised Restoration for Adverse Weather-Affected Images
Rajeev Yasarla, Carey E. Priebe, Vishal M. Patel
[pdf]
[DOI]

Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion
Pengwei Liang, Junjun Jiang, Xianming Liu, Jiayi Ma
[pdf]
[DOI]

Learning Degradation Representations for Image Deblurring
Dasong Li, Yi Zhang, Ka Chun Cheung, Xiaogang Wang, Hongwei Qin, Hongsheng Li
[pdf]
[DOI]

Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution
Xiaoyu Dong, Naoto Yokoya, Longguang Wang, Tatsumi Uezato
[pdf]
[DOI]

Spectrum-Aware and Transferable Architecture Search for Hyperspectral Image Restoration
Wei He, Quanming Yao, Naoto Yokoya, Tatsumi Uezato, Hongyan Zhang, Liangpei Zhang
[pdf]
[DOI]

Neural Color Operators for Sequential Image Retouching
Yili Wang, Xin Li, Kun Xu, Dongliang He, Qi Zhang, Fu Li, Errui Ding
[pdf]
[DOI]

Optimizing Image Compression via Joint Learning with Denoising
Ka Leong Cheng, Yueqi Xie, Qifeng Chen
[pdf]
[DOI]

Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks
Xiaotao Hu, Jun Xu, Shuhang Gu, Ming-Ming Cheng, Li Liu
[pdf]
[DOI]

Compiler-Aware Neural Architecture Search for On-Mobile Real-Time Super-Resolution
Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang
[pdf]
[DOI]

Modeling Mask Uncertainty in Hyperspectral Image Reconstruction
Jiamian Wang, Yulun Zhang, Xin Yuan, Ziyi Meng, Zhiqiang Tao
[pdf]
[DOI]

Perceiving and Modeling Density for Image Dehazing
Tian Ye, Yunchen Zhang, Mingchao Jiang, Liang Chen, Yun Liu, Sixiang Chen, Erkang Chen
[pdf]
[DOI]

Stripformer: Strip Transformer for Fast Image Deblurring
Fu-Jen Tsai, Yan-Tsung Peng, Yen-Yu Lin, Chung-Chi Tsai, Chia-Wen Lin
[pdf]
[DOI]

Deep Fourier-Based Exposure Correction Network with Spatial-Frequency Interaction
Jie Huang, Yajing Liu, Feng Zhao, Keyu Yan, Jinghao Zhang, Yukun Huang, Man Zhou, Zhiwei Xiong
[pdf]
[DOI]

Frequency and Spatial Dual Guidance for Image Dehazing
Hu Yu, Naishan Zheng, Man Zhou, Jie Huang, Zeyu Xiao, Feng Zhao
[pdf]
[DOI]

Towards Real-World HDRTV Reconstruction: A Data Synthesis-Based Approach
Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong
[pdf]
[DOI]

Learning Discriminative Shrinkage Deep Networks for Image Deconvolution
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien, Ming-Hsuan Yang
[pdf]
[DOI]

KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution
Jiahong Fu, Hong Wang, Qi Xie, Qian Zhao, Deyu Meng, Zongben Xu
[pdf]
[DOI]

ARM: Any-Time Super-Resolution Method
Bohong Chen, Mingbao Lin, Kekai Sheng, Mengdan Zhang, Peixian Chen, Ke Li, Liujuan Cao, Rongrong Ji
[pdf]
[DOI]

Attention-Aware Learning for Hyperparameter Prediction in Image Processing Pipelines
Haina Qin, Longfei Han, Juan Wang, Congxuan Zhang, Yanwei Li, Bing Li, Weiming Hu
[pdf]
[DOI]

RealFlow: EM-Based Realistic Optical Flow Dataset Generation from Videos
Yunhui Han, Kunming Luo, Ao Luo, Jiangyu Liu, Haoqiang Fan, Guiming Luo, Shuaicheng Liu
[pdf]
[DOI]

Memory-Augmented Model-Driven Network for Pansharpening
Keyu Yan, Man Zhou, Li Zhang, Chengjun Xie
[pdf]
[DOI]

All You Need Is RAW: Defending against Adversarial Attacks with Camera Image Pipelines
Yuxuan Zhang, Bo Dong, Felix Heide
[pdf]
[DOI]

Ghost-Free High Dynamic Range Imaging with Context-Aware Transformer
Zhen Liu, Yinglong Wang, Bing Zeng, Shuaicheng Liu
[pdf]
[DOI]

Style-Guided Shadow Removal
Jin Wan, Hui Yin, Zhenyao Wu, Xinyi Wu, Yanting Liu, Song Wang
[pdf]
[DOI]

D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution
Youwei Li, Haibin Huang, Lanpeng Jia, Haoqiang Fan, Shuaicheng Liu
[pdf]
[DOI]

GRIT-VLP: Grouped Mini-Batch Sampling for Efficient Vision and Language Pre-training
Jaeseok Byun, Taebaek Hwang, Jianlong Fu, Taesup Moon
[pdf]
[DOI]

Efficient Video Deblurring Guided by Motion Magnitude
Yusheng Wang, Yunfan Lu, Ye Gao, Lin Wang, Zhihang Zhong, Yinqiang Zheng, Atsushi Yamashita
[pdf]
[DOI]

Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and a New Physics-Inspired Transformer Model
Zhiyuan Mao, Ajay Jaiswal, Zhangyang Wang, Stanley H. Chan
[pdf]
[DOI]

Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression
A. Burakhan Koyuncu, Han Gao, Atanas Boev, Georgii Gaikov, Elena Alshina, Eckehard Steinbach
[pdf]
[DOI]

Image Super-Resolution with Deep Dictionary
Shunta Maeda
[pdf]
[DOI]

TempFormer: Temporally Consistent Transformer for Video Denoising
Mingyang Song, Yang Zhang, Tunç O. Aydın
[pdf]
[DOI]

RAWtoBit: A Fully End-to-End Camera ISP Network
Wooseok Jeong, Seung-Won Jung
[pdf]
[DOI]

DRCNet: Dynamic Image Restoration Contrastive Network
Fei Li, Lingfeng Shen, Yang Mi, Zhenbo Li
[pdf]
[DOI]

Zero-Shot Learning for Reflection Removal of Single 360-Degree Image
Byeong-Ju Han, Jae-Young Sim
[pdf]
[DOI]

Transformer with Implicit Edges for Particle-Based Physics Simulation
Yidi Shao, Chen Change Loy, Bo Dai
[pdf]
[DOI]

Rethinking Video Rain Streak Removal: A New Synthesis Model and a Deraining Network with Video Rain Prior
Shuai Wang, Lei Zhu, Huazhu Fu, Jing Qin, Carola-Bibiane Schönlieb, Wei Feng, Song Wang
[pdf]
[DOI]

Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images
Jinjin Gu, Haoming Cai, Chenyu Dong, Ruofan Zhang, Yulun Zhang, Wenming Yang, Chun Yuan
[pdf]
[DOI]

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance
Zhihang Zhong, Xiao Sun, Zhirong Wu, Yinqiang Zheng, Stephen Lin, Imari Sato
[pdf]
[DOI]

AlphaVC: High-Performance and Efficient Learned Video Compression
Yibo Shi, Yunying Ge, Jing Wang, Jue Mao
[pdf]
[DOI]

Content-Oriented Learned Image Compression
Meng Li, Shangyin Gao, Yihui Feng, Yibo Shi, Jing Wang
[pdf]
[DOI]

RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection
Lin Zhang, Xin Li, Dongliang He, Fu Li, Yili Wang, Zhaoxiang Zhang
[pdf]
[DOI]

Contrastive Prototypical Network with Wasserstein Confidence Penalty
Haoqing Wang, Zhi-Hong Deng
[pdf]
[DOI]

Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition
Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
[pdf]
[DOI]

Self-Support Few-Shot Semantic Segmentation
Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang
[pdf]
[DOI]

Few-Shot Object Detection with Model Calibration
Qi Fan, Chi-Keung Tang, Yu-Wing Tai
[pdf]
[DOI]

Self-Supervision Can Be a Good Few-Shot Learner
Yuning Lu, Liangjian Wen, Jianzhuang Liu, Yajing Liu, Xinmei Tian
[pdf]
[DOI]

tSF: Transformer-Based Semantic Filter for Few-Shot Learning
Jinxiang Lai, Siqian Yang, Wenlong Liu, Yi Zeng, Zhongyi Huang, Wenlong Wu, Jun Liu, Bin-Bin Gao, Chengjie Wang
[pdf]
[DOI]

Adversarial Feature Augmentation for Cross-Domain Few-Shot Classification
Yanxu Hu, Andy J. Ma
[pdf]
[DOI]

Constructing Balance from Imbalance for Long-Tailed Image Recognition
Yue Xu, Yong-Lu Li, Jiefeng Li, Cewu Lu
[pdf]
[DOI]

On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond
Yuzhe Yang, Hao Wang, Dina Katabi
[pdf]
[DOI]

Few-Shot Video Object Detection
Qi Fan, Chi-Keung Tang, Yu-Wing Tai
[pdf]
[DOI]

Worst Case Matters for Few-Shot Recognition
Minghao Fu, Yun-Hao Cao, Jianxin Wu
[pdf]
[DOI]

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification
Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny
[pdf]
[DOI]

Doubly Deformable Aggregation of Covariance Matrices for Few-Shot Segmentation
Zhitong Xiong, Haopeng Li, Xiao Xiang Zhu
[pdf]
[DOI]

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation
Xinyu Shi, Dong Wei, Yu Zhang, Donghuan Lu, Munan Ning, Jiashun Chen, Kai Ma, Yefeng Zheng
[pdf]
[DOI]

Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning
Xingping Dong, Jianbing Shen, Ling Shao
[pdf]
[DOI]

CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition
Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach
[pdf]
[DOI]

Few-Shot Class-Incremental Learning for 3D Point Cloud Objects
Townim Chowdhury, Ali Cheraghian, Sameera Ramasinghe, Sahar Ahmadi, Morteza Saberi, Shafin Rahman
[pdf]
[DOI]

Meta-Learning with Less Forgetting on Large-Scale Non-stationary Task Distributions
Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao
[pdf]
[DOI]

DNA: Improving Few-Shot Transfer Learning with Low-Rank Decomposition and Alignment
Ziyu Jiang, Tianlong Chen, Xuxi Chen, Yu Cheng, Luowei Zhou, Lu Yuan, Ahmed Awadallah, Zhangyang Wang
[pdf]
[DOI]

Learning Instance and Task-Aware Dynamic Kernels for Few-Shot Learning
Rongkai Ma, Pengfei Fang, Gil Avraham, Yan Zuo, Tianyu Zhu, Tom Drummond, Mehrtash Harandi
[pdf]
[DOI]

Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
Quande Liu, Youpeng Wen, Jianhua Han, Chunjing Xu, Hang Xu, Xiaodan Liang
[pdf]
[DOI]

Few-Shot Classification with Contrastive Learning
Zhanyuan Yang, Jinghua Wang, Yingying Zhu
[pdf]
[DOI]

Time-rEversed diffusioN tEnsor Transformer: A New TENET of Few-Shot Object Detection
Shan Zhang, Naila Murray, Lei Wang, Piotr Koniusz
[pdf]
[DOI]

Self-Promoted Supervision for Few-Shot Transformer
Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo
[pdf]
[DOI]

Few-Shot Object Counting and Detection
Thanh Nguyen, Chau Pham, Khoi Nguyen, Minh Hoai
[pdf]
[DOI]

Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark
Kibok Lee, Hao Yang, Satyaki Chakraborty, Zhaowei Cai, Gurumurthy Swaminathan, Avinash Ravichandran, Onkar Dabeer
[pdf]
[DOI]

Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations
Wentao Chen, Zhang Zhang, Wei Wang, Liang Wang, Zilei Wang, Tieniu Tan
[pdf]
[DOI]

Mutually Reinforcing Structure with Proposal Contrastive Consistency for Few-Shot Object Detection
Tianxue Ma, Mingwei Bi, Jian Zhang, Wang Yuan, Zhizhong Zhang, Yuan Xie, Shouhong Ding, Lizhuang Ma
[pdf]
[DOI]

Dual Contrastive Learning with Anatomical Auxiliary Supervision for Few-Shot Medical Image Segmentation
Huisi Wu, Fangyan Xiao, Chongxin Liang
[pdf]
[DOI]

Improving Few-Shot Learning through Multi-task Representation Learning Theory
Quentin Bouniot, Ievgen Redko, Romaric Audigier, Angélique Loesch, Amaury Habrard
[pdf]
[DOI]

Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation
Min Zhang, Siteng Huang, Wenbin Li, Donglin Wang
[pdf]
[DOI]

Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments
Khoi D. Nguyen, Quoc-Huy Tran, Khoi Nguyen, Binh-Son Hua, Rang Nguyen
[pdf]
[DOI]

Temporal and Cross-Modal Attention for Audio-Visual Zero-Shot Learning
Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
[pdf]
[DOI]

HM: Hybrid Masking for Few-Shot Segmentation
Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia
[pdf]
[DOI]

TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning
Haoquan Li, Laoming Zhang, Daoan Zhang, Lang Fu, Peng Yang, Jianguo Zhang
[pdf]
[DOI]

Kernel Relative-Prototype Spectral Filtering for Few-Shot Learning
Tao Zhang, Wu Huang
[pdf]
[DOI]

“This Is My Unicorn, Fluffy”: Personalizing Frozen Vision-Language Representations
Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
[pdf]
[DOI]

CLOSE: Curriculum Learning on the Sharing Extent towards Better One-Shot NAS
Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, Yu Wang
[pdf]
[DOI]

Streamable Neural Fields
Junwoo Cho, Seungtae Nam, Daniel Rho, Jong Hwan Ko, Eunbyung Park
[pdf]
[DOI]

Gradient-Based Uncertainty for Monocular Depth Estimation
Julia Hornauer, Vasileios Belagiannis
[pdf]
[DOI]

Online Continual Learning with Contrastive Vision Transformer
Zhen Wang, Liu Liu, Yajing Kong, Jiaxian Guo, Dacheng Tao
[pdf]
[DOI]

CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Taeho Kim, Yongin Kwon, Jemin Lee, Taeho Kim, Sangtae Ha
[pdf]
[DOI]

EAutoDet: Efficient Architecture Search for Object Detection
Xiaoxing Wang, Jiale Lin, Juanping Zhao, Xiaokang Yang, Junchi Yan
[pdf]
[DOI]

A Max-Flow Based Approach for Neural Architecture Search
Chao Xue, Xiaoxing Wang, Junchi Yan, Chun-Guang Li
[pdf]
[DOI]

OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses
Robik Shrestha, Kushal Kafle, Christopher Kanan
[pdf]
[DOI]

ERA: Enhanced Rational Activations
Martin Trimmel, Mihai Zanfir, Richard Hartley, Cristian Sminchisescu
[pdf]
[DOI]

Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang, Hongmin Xu, Xiong Zhang, Li Wang, Zhitong Zheng, Haifeng Liu
[pdf]
[DOI]

Active Label Correction Using Robust Parameter Update and Entropy Propagation
Kwang In Kim
[pdf]
[DOI]

Unpaired Image Translation via Vector Symbolic Architectures
Justin Theiss, Jay Leverett, Daeil Kim, Aayush Prakash
[pdf]
[DOI]

UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu, Xin Huang, Guanglu Song, Hongsheng Li, Yu Liu
[pdf]
[DOI]

AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers
Yongming Rao, Wenliang Zhao, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan
[pdf]
[DOI]

Equivariant Hypergraph Neural Networks
Jinwoo Kim, Saeyoon Oh, Sungjun Cho, Seunghoon Hong
[pdf]
[DOI]

ScaleNet: Searching for the Model to Scale
Jiyang Xie, Xiu Su, Shan You, Zhanyu Ma, Fei Wang, Chen Qian
[pdf]
[DOI]

Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction
Vincent Le Guen, Clément Rambour, Nicolas Thome
[pdf]
[DOI]

ViTAS: Vision Transformer Architecture Search
Xiu Su, Shan You, Jiyang Xie, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, Chang Xu
[pdf]
[DOI]

LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds
Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov
[pdf]
[DOI]

Uncertainty-DTW for Time Series and Sequences
Lei Wang, Piotr Koniusz
[pdf]
[DOI]

Black-Box Few-Shot Knowledge Distillation
Dang Nguyen, Sunil Gupta, Kien Do, Svetha Venkatesh
[pdf]
[DOI]

Revisiting Batch Norm Initialization
Jim Davis, Logan Frank
[pdf]
[DOI]

SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling
Ho Man Kwan, Shenghui Song
[pdf]
[DOI]

Filter Pruning via Feature Discrimination in Deep Neural Networks
Zhiqiang He, Yaguan Qian, Yuqi Wang, Bin Wang, Xiaohui Guan, Zhaoquan Gu, Xiang Ling, Shaoning Zeng, Haijiang Wang, Wujie Zhou
[pdf]
[DOI]

LA3: Efficient Label-Aware AutoAugment
Mingjun Zhao, Shan Lu, Zixuan Wang, Xiaoli Wang, Di Niu
[pdf]
[DOI]

Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps
Alireza Ganjdanesh, Shangqian Gao, Heng Huang
[pdf]
[DOI]

BA-Net: Bridge Attention for Deep Convolutional Neural Networks
Yue Zhao, Junzhou Chen, Zirui Zhang, Ronghui Zhang
[pdf]
[DOI]

SAU: Smooth Activation Function Using Convolution with Approximate Identities
Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey
[pdf]
[DOI]

Multi-Exit Semantic Segmentation Networks
Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis, Nicholas Lane
[pdf]
[DOI]

Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks
Bernd Prach, Christoph H. Lampert
[pdf]
[DOI]

PointScatter: Point Set Representation for Tubular Structure Extraction
Dong Wang, Zhao Zhang, Ziwei Zhao, Yuhang Liu, Yihong Chen, Liwei Wang
[pdf]
[DOI]

Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection
Ziwei Zhao, Dong Wang, Yihong Chen, Ziteng Wang, Liwei Wang
[pdf]
[DOI]

Graph-Constrained Contrastive Regularization for Semi-Weakly Volumetric Segmentation
Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen
[pdf]
[DOI]

Generalizable Medical Image Segmentation via Random Amplitude Mixup and Domain-Specific Image Restoration
Ziqi Zhou, Lei Qi, Yinghuan Shi
[pdf]
[DOI]

Auto-FedRL: Federated Hyperparameter Optimization for Multi-Institutional Medical Image Segmentation
Pengfei Guo, Dong Yang, Ali Hatamizadeh, An Xu, Ziyue Xu, Wenqi Li, Can Zhao, Daguang Xu, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Vishal M. Patel, Holger R. Roth
[pdf]
[DOI]

Personalizing Federated Medical Image Segmentation via Local Calibration
Jiacheng Wang, Yueming Jin, Liansheng Wang
[pdf]
[DOI]

One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement
Zihao Yin, Ping Gong, Chunyu Wang, Yizhou Yu, Yizhou Wang
[pdf]
[DOI]

Ultra-High-Resolution Unpaired Stain Transformation via Kernelized Instance Normalization
Ming-Yang Ho, Min-Sheng Wu, Che-Ming Wu
[pdf]
[DOI]

Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation
Wenxuan Wang, Chen Chen, Jing Wang, Sen Zha, Yan Zhang, Jiangyun Li
[pdf]
[DOI]

ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images
Jiawei Yang, Hanbo Chen, Yuan Liang, Junzhou Huang, Lei He, Jianhua Yao
[pdf]
[DOI]

CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images
Axel Levy, Frédéric Poitevin, Julien Martel, Youssef Nashed, Ariana Peck, Nina Miolane, Daniel Ratner, Mike Dunne, Gordon Wetzstein
[pdf]
[DOI]

UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier
Yutong Xie, Jianpeng Zhang, Yong Xia, Qi Wu
[pdf]
[DOI]

DLME: Deep Local-Flatness Manifold Embedding
Zelin Zang, Siyuan Li, Di Wu, Ge Wang, Kai Wang, Lei Shang, Baigui Sun, Hao Li, Stan Z. Li
[pdf]
[DOI]

Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching
Jiazhen Liu, Xirong Li, Qijie Wei, Jie Xu, Dayong Ding
[pdf]
[DOI]

Graph Neural Network for Cell Tracking in Microscopy Videos
Tal Ben-Haim, Tammy Riklin Raviv
[pdf]
[DOI]

CXR Segmentation by AdaIN-Based Domain Adaptation and Knowledge Distillation
Yujin Oh, Jong Chul Ye
[pdf]
[DOI]

Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels
Qinwen Huang, Ye Zhou, Hsuan-Fu Liu, Alberto Bartesaghi
[pdf]
[DOI]

K-SALSA: K-Anonymous Synthetic Averaging of Retinal Images via Local Style Alignment
Minkyu Jeon, Hyeonjin Park, Hyunwoo J. Kim, Michael Morley, Hyunghoon Cho
[pdf]
[DOI]

RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-Guided Disease Classification
Moinak Bhattacharya, Shubham Jain, Prateek Prasanna
[pdf]
[DOI]

Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images
Kevin Thandiackal, Boqi Chen, Pushpak Pati, Guillaume Jaume, Drew F. K. Williamson, Maria Gabrani, Orcun Goksel
[pdf]
[DOI]

Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis
Chongyang Zhong, Lei Hu, Zihao Zhang, Shihong Xia
[pdf]
[DOI]

Towards Grand Unification of Object Tracking
Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu
[pdf]
[DOI]

ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang
[pdf]
[DOI]

Robust Multi-Object Tracking by Marginal Inference
Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu
[pdf]
[DOI]

PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?
Aleksandr Kim, Guillem Brasó, Aljoša Ošep, Laura Leal-Taixé
[pdf]
[DOI]

Particle Video Revisited: Tracking through Occlusions Using Point Trajectories
Adam W. Harley, Zhaoyuan Fang, Katerina Fragkiadaki
[pdf]
[DOI]

Tracking Objects As Pixel-Wise Distributions
Zelin Zhao, Ze Wu, Yueqing Zhuang, Boxun Li, Jiaya Jia
[pdf]
[DOI]

CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds
Zhiyang Guo, Yunyao Mao, Wengang Zhou, Min Wang, Houqiang Li
[pdf]
[DOI]

Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline
Jinyu Yang, Zhongqun Zhang, Zhe Li, Hyung Jin Chang, Aleš Leonardis, Feng Zheng
[pdf]
[DOI]

Hierarchical Latent Structure for Multi-modal Vehicle Trajectory Forecasting
Dooseop Choi, KyoungWook Min
[pdf]
[DOI]

AiATrack: Attention in Attention for Transformer Visual Tracking
Shenyuan Gao, Chunluan Zhou, Chao Ma, Xinggang Wang, Junsong Yuan
[pdf]
[DOI]

Disentangling Architecture and Training for Optical Flow
Deqing Sun, Charles Herrmann, Fitsum Reda, Michael Rubinstein, David J. Fleet, William T. Freeman
[pdf]
[DOI]

A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow
Jenny Schmalfuss, Philipp Scholze, Andrés Bruhn
[pdf]
[DOI]

Robust Landmark-Based Stent Tracking in X-Ray Fluoroscopy
Luojie Huang, Yikang Liu, Li Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun
[pdf]
[DOI]

Social ODE: Multi-agent Trajectory Forecasting with Neural Ordinary Differential Equations
Song Wen, Hao Wang, Dimitris N. Metaxas
[pdf]
[DOI]

Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-agent Trajectory Prediction
Li-Wu Tsao, Yan-Kai Wang, Hao-Siang Lin, Hong-Han Shuai, Lai-Kuan Wong, Wen-Huang Cheng
[pdf]
[DOI]

Diverse Human Motion Prediction Guided by Multi-level Spatial-Temporal Anchors
Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui
[pdf]
[DOI]

Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction
Inhwan Bae, Jin-Hwi Park, Hae-Gon Jeon
[pdf]
[DOI]

Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation
Gang Zhang, Xiaoyan Li, Zhenhua Wang
[pdf]
[DOI]

E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs
Yanyan Li, Federico Tombari
[pdf]
[DOI]

Point Cloud Compression with Range Image-Based Entropy Model for Autonomous Driving
Sukai Wang, Ming Liu
[pdf]
[DOI]

Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
Botao Ye, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

MotionCLIP: Exposing Human Motion Generation to CLIP Space
Guy Tevet, Brian Gordon, Amir Hertz, Amit H. Bermano, Daniel Cohen-Or
[pdf]
[DOI]

Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking
Boyu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang
[pdf]
[DOI]

Aware of the History: Trajectory Forecasting with the Local Behavior Data
Yiqi Zhong, Zhenyang Ni, Siheng Chen, Ulrich Neumann
[pdf]
[DOI]

Optical Flow Training under Limited Label Budget via Active Learning
Shuai Yuan, Xian Sun, Hannah Kim, Shuzhi Yu, Carlo Tomasi
[pdf]
[DOI]

Hierarchical Feature Embedding for Visual Tracking
Zhixiong Pi, Weitao Wan, Chong Sun, Changxin Gao, Nong Sang, Chen Li
[pdf]
[DOI]

Tackling Background Distraction in Video Object Segmentation
Suhwan Cho, Heansung Lee, Minhyeok Lee, Chaewon Park, Sungjun Jang, Minjung Kim, Sangyoun Lee
[pdf]
[DOI]

Social-Implicit: Rethinking Trajectory Prediction Evaluation and the Effectiveness of Implicit Maximum Likelihood Estimation
Abduallah Mohamed, Deyao Zhu, Warren Vu, Mohamed Elhoseiny, Christian Claudel
[pdf]
[DOI]

TEMOS: Generating Diverse Human Motions from Textual Descriptions
Mathis Petrovich, Michael J. Black, Gül Varol
[pdf]
[DOI]

Tracking Every Thing in the Wild
Siyuan Li, Martin Danelljan, Henghui Ding, Thomas E. Huang, Fisher Yu
[pdf]
[DOI]

HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance
Soshi Shimada, Vladislav Golyanik, Zhi Li, Patrick Pérez, Weipeng Xu, Christian Theobalt
[pdf]
[DOI]

Towards Sequence-Level Training for Visual Tracking
Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho
[pdf]
[DOI]

Learned Monocular Depth Priors in Visual-Inertial Initialization
Yunwen Zhou, Abhishek Kar, Eric Turner, Adarsh Kowdle, Chao X. Guo, Ryan C. DuToit, Konstantine Tsotsos
[pdf]
[DOI]

Robust Visual Tracking by Segmentation
Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc Van Gool
[pdf]
[DOI]

MeshLoc: Mesh-Based Visual Localization
Vojtech Panek, Zuzana Kukelova, Torsten Sattler
[pdf]
[DOI]

S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction
Yu-Wen Chen, Hsuan-Kung Yang, Chu-Chi Chiu, Chun-Yi Lee
[pdf]
[DOI]

Large-Displacement 3D Object Tracking with Hybrid Non-local Optimization
Xuhui Tian, Xinran Lin, Fan Zhong, Xueying Qin
[pdf]
[DOI]

FEAR: Fast, Efficient, Accurate and Robust Visual Tracker
Vasyl Borsuk, Roman Vei, Orest Kupyn, Tetiana Martyniuk, Igor Krashenyi, Jiří Matas
[pdf]
[DOI]

PREF: Predictability Regularized Neural Motion Fields
Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu
[pdf]
[DOI]

View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums
Conghao Wong, Beihao Xia, Ziming Hong, Qinmu Peng, Wei Yuan, Qiong Cao, Yibo Yang, Xinge You
[pdf]
[DOI]

HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking
Haoxian Zhang, Yonggen Ling
[pdf]
[DOI]

RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer
Jianfeng Xiang, Junliang Chen, Wenshuang Liu, Xianxu Hou, Linlin Shen
[pdf]
[DOI]

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image
Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
[pdf]
[DOI]

Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation
Guangcong Zheng, Shengming Li, Hui Wang, Taiping Yao, Yang Chen, Shouhong Ding, Xi Li
[pdf]
[DOI]

Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling
Hengyuan Ma, Li Zhang, Xiatian Zhu, Jianfeng Feng
[pdf]
[DOI]

Learning to Generate Realistic LiDAR Point Clouds
Vlas Zyrianov, Xiyue Zhu, Shenlong Wang
[pdf]
[DOI]

RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds
Tuan-Anh Vu, Thanh Nguyen, Binh-Son Hua, Quang-Hieu Pham, Sai-Kit Yeung
[pdf]
[DOI]

Diverse Image Inpainting with Normalizing Flow
Cairong Wang, Yiming Zhu, Chun Yuan
[pdf]
[DOI]

Improved Masked Image Generation with Token-Critic
José Lezama, Huiwen Chang, Lu Jiang, Irfan Essa
[pdf]
[DOI]

TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation
Junghyuk Lee, Jong-Seok Lee
[pdf]
[DOI]

Exploring Gradient-Based Multi-directional Controls in GANs
Zikun Chen, Ruowei Jiang, Brendan Duke, Han Zhao, Parham Aarabi
[pdf]
[DOI]

Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition
Tianyu Wang, Miaomiao Liu, Kee Siong Ng
[pdf]
[DOI]

Neural Scene Decoration from a Single Photograph
Hong-Wing Pang, Yingshu Chen, Phuoc-Hieu Le, Binh-Son Hua, Thanh Nguyen, Sai-Kit Yeung
[pdf]
[DOI]

Outpainting by Queries
Kai Yao, Penglei Gao, Xi Yang, Jie Sun, Rui Zhang, Kaizhu Huang
[pdf]
[DOI]

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes
Sam Bond-Taylor, Peter Hessey, Hiroshi Sasaki, Toby P. Breckon, Chris G. Willcocks
[pdf]
[DOI]

ChunkyGAN: Real Image Inversion via Segments
Adéla Šubrtová, David Futschik, Jan Čech, Michal Lukáč, Eli Shechtman, Daniel Sýkora
[pdf]
[DOI]

GAN Cocktail: Mixing GANs without Dataset Access
Omri Avrahami, Dani Lischinski, Ohad Fried
[pdf]
[DOI]

Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering
Mingfei Chen, Jianfeng Zhang, Xiangyu Xu, Lijuan Liu, Yujun Cai, Jiashi Feng, Shuicheng Yan
[pdf]
[DOI]

Controllable Shadow Generation Using Pixel Height Maps
Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes
[pdf]
[DOI]

Learning Where to Look – Generative NAS Is Surprisingly Efficient
Jovita Lukasik, Steffen Jung, Margret Keuper
[pdf]
[DOI]

Subspace Diffusion Generative Models
Bowen Jing, Gabriele Corso, Renato Berlinghieri, Tommi Jaakkola
[pdf]
[DOI]

DuelGAN: A Duel between Two Discriminators Stabilizes the GAN Training
Jiaheng Wei, Minghao Liu, Jiahao Luo, Andrew Zhu, James Davis, Yang Liu
[pdf]
[DOI]

MINER: Multiscale Implicit Neural Representation
Vishwanath Saragadam, Jasper Tan, Guha Balakrishnan, Richard G. Baraniuk, Ashok Veeraraghavan
[pdf]
[DOI]

An Embedded Feature Whitening Approach to Deep Neural Network Optimization
Hongwei Yong, Lei Zhang
[pdf]
[DOI]

Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization
Alp Yurtsever, Tolga Birdal, Vladislav Golyanik
[pdf]
[DOI]

Self-Supervised Learning of Visual Graph Matching
Chang Liu, Shaofeng Zhang, Xiaokang Yang, Junchi Yan
[pdf]
[DOI]

Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models
Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Ahmed Awadallah, Zhangyang Wang
[pdf]
[DOI]

QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving lq-Norm Optimization Problem
Gang-Xuan Lin, Shih-Wei Hu, Chun-Shien Lu
[pdf]
[DOI]

R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning
Qiankun Gao, Chen Zhao, Bernard Ghanem, Jian Zhang
[pdf]
[DOI]

Domain Generalization by Mutual-Information Regularization with Pre-trained Models
Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun
[pdf]
[DOI]

Predicting Is Not Understanding: Recognizing and Addressing Underspecification in Machine Learning
Damien Teney, Maxime Peyrard, Ehsan Abbasnejad
[pdf]
[DOI]

Neural-Sim: Learning to Generate Training Data with NeRF
Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet
[pdf]
[DOI]

Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning
Hanwei Fan, Jiandong Mu, Wei Zhang
[pdf]
[DOI]

Learned Variational Video Color Propagation
Markus Hofinger, Erich Kobler, Alexander Effland, Thomas Pock
[pdf]
[DOI]

Continual Variational Autoencoder Learning via Online Cooperative Memorization
Fei Ye, Adrian G. Bors
[pdf]
[DOI]

Learning to Learn with Smooth Regularization
Yuanhao Xiong, Cho-Jui Hsieh
[pdf]
[DOI]

Incremental Task Learning with Incremental Rank Updates
Rakib Hyder, Ken Shao, Boyu Hou, Panos Markopoulos, Ashley Prater-Bennette, M. Salman Asif
[pdf]
[DOI]

Batch-Efficient EigenDecomposition for Small and Medium Matrices
Yue Song, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Ensemble Learning Priors Driven Deep Unfolding for Scalable Video Snapshot Compressive Imaging
Chengshuai Yang, Shiyu Zhang, Xin Yuan
[pdf]
[DOI]

Approximate Discrete Optimal Transport Plan with Auxiliary Measure Method
Dongsheng An, Na Lei, Xianfeng Gu
[pdf]
[DOI]

A Comparative Study of Graph Matching Algorithms in Computer Vision
Stefan Haller, Lorenz Feineis, Lisa Hutschenreiter, Florian Bernard, Carsten Rother, Dagmar Kainmüller, Paul Swoboda, Bogdan Savchynskyy
[pdf]
[DOI]

Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola, Barbara Caputo, Marco Ciccone
[pdf]
[DOI]

Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not
Liangzu Peng, Mahyar Fazlyab, René Vidal
[pdf]
[DOI]

Transfer without Forgetting
Matteo Boschini, Lorenzo Bonicelli, Angelo Porrello, Giovanni Bellitto, Matteo Pennisi, Simone Palazzo, Concetto Spampinato, Simone Calderara
[pdf]
[DOI]

AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation
Farshid Varno, Marzie Saghayi, Laya Rafiee Sevyeri, Sharut Gupta, Stan Matwin, Mohammad Havaei
[pdf]
[DOI]

Tackling Long-Tailed Category Distribution under Domain Shifts
Xiao Gu, Yao Guo, Zeju Li, Jianing Qiu, Qi Dou, Yuxuan Liu, Benny Lo, Guang-Zhong Yang
[pdf]
[DOI]

Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with Local Representation
Li Gao, Dong Nie, Bo Li, Xiaofeng Ren
[pdf]
[DOI]

Improving Vision Transformers by Revisiting High-Frequency Components
Jiawang Bai, Li Yuan, Shu-Tao Xia, Shuicheng Yan, Zhifeng Li, Wei Liu
[pdf]
[DOI]

Recurrent Bilinear Optimization for Binary Neural Networks
Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng Gao, Yu Qiao, Jinhu Lü, Guodong Guo
[pdf]
[DOI]

Neural Architecture Search for Spiking Neural Networks
Youngeun Kim, Yuhang Li, Hyoungseob Park, Yeshwanth Venkatesha, Priyadarshini Panda
[pdf]
[DOI]

Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification
Yang Liu, Lei Zhou, Pengcheng Zhang, Xiao Bai, Lin Gu, Xiaohan Yu, Jun Zhou, Edwin R. Hancock
[pdf]
[DOI]

DaViT: Dual Attention Vision Transformers
Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan
[pdf]
[DOI]

Optimal Transport for Label-Efficient Visible-Infrared Person Re-identification
Jiangming Wang, Zhizhong Zhang, Mingang Chen, Yi Zhang, Cong Wang, Bin Sheng, Yanyun Qu, Yuan Xie
[pdf]
[DOI]

Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li, Runyi Yu, Zhennan Wang, Li Yuan, Guoli Song, Jie Chen
[pdf]
[DOI]

Neighborhood Collective Estimation for Noisy Label Identification and Correction
Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu
[pdf]
[DOI]

Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay
Huan Liu, Li Gu, Zhixiang Chi, Yang Wang, Yuanhao Yu, Jun Chen, Jin Tang
[pdf]
[DOI]

Anti-Retroactive Interference for Lifelong Learning
Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo
[pdf]
[DOI]

Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning
Hualiang Wang, Siming Fu, Xiaoxuan He, Hangxiang Fang, Zuozhu Liu, Haoji Hu
[pdf]
[DOI]

Dynamic Metric Learning with Cross-Level Concept Distillation
Wenzhao Zheng, Yuanhui Huang, Borui Zhang, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing
Linhui Sun, Yifan Zhang, Ke Cheng, Jian Cheng, Hanqing Lu
[pdf]
[DOI]

Out-of-Distribution Detection with Boundary Aware Learning
Sen Pei, Xin Zhang, Bin Fan, Gaofeng Meng
[pdf]
[DOI]

Learning Hierarchy Aware Features for Reducing Mistake Severity
Ashima Garg, Depanshu Sani, Saket Anand
[pdf]
[DOI]

Learning to Detect Every Thing in an Open World
Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko
[pdf]
[DOI]

KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang, Xue Wang, Fan Wang, Ming Lin, Shuning Chang, Hao Li, Rong Jin
[pdf]
[DOI]

Registration Based Few-Shot Anomaly Detection
Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya Zhang, Michael Spratling, Yan-Feng Wang
[pdf]
[DOI]

Improving Robustness by Enhancing Weak Subnets
Yong Guo, David Stutz, Bernt Schiele
[pdf]
[DOI]

Learning Invariant Visual Representations for Compositional Zero-Shot Learning
Tian Zhang, Kongming Liang, Ruoyi Du, Xian Sun, Zhanyu Ma, Jun Guo
[pdf]
[DOI]

Improving Covariance Conditioning of the SVD Meta-Layer by Orthogonality
Yue Song, Nicu Sebe, Wei Wang
[pdf]
[DOI]

Out-of-Distribution Detection with Semantic Mismatch under Masking
Yijun Yang, Ruiyuan Gao, Qiang Xu
[pdf]
[DOI]

Data-Free Neural Architecture Search via Recursive Label Calibration
Zechun Liu, Zhiqiang Shen, Yun Long, Eric Xing, Kwang-Ting Cheng, Chas Leichner
[pdf]
[DOI]

Learning from Multiple Annotator Noisy Labels via Sample-Wise Label Fusion
Zhengqi Gao, Fan-Keng Sun, Mingran Yang, Sucheng Ren, Zikai Xiong, Marc Engeler, Antonio Burazer, Linda Wildling, Luca Daniel, Duane S. Boning
[pdf]
[DOI]

Acknowledging the Unknown for Multi-Label Learning with Single Positive Labels
Donghao Zhou, Pengfei Chen, Qiong Wang, Guangyong Chen, Pheng-Ann Heng
[pdf]
[DOI]

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers
Zicheng Liu, Siyuan Li, Di Wu, Zihan Liu, Zhiyuan Chen, Lirong Wu, Stan Z. Li
[pdf]
[DOI]

MaxViT: Multi-axis Vision Transformer
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
[pdf]
[DOI]

ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer
Rui Yang, Hailong Ma, Jie Wu, Yansong Tang, Xuefeng Xiao, Min Zheng, Xiu Li
[pdf]
[DOI]

Three Things Everyone Should Know about Vision Transformers
Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou
[pdf]
[DOI]

DeiT III: Revenge of the ViT
Hugo Touvron, Matthieu Cord, Hervé Jégou
[pdf]
[DOI]

MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang
[pdf]
[DOI]

Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition
Zhou Yang, Weisheng Dong, Xin Li, Jinjian Wu, Leida Li, Guangming Shi
[pdf]
[DOI]

Novel Class Discovery without Forgetting
K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian
[pdf]
[DOI]

SAFA: Sample-Adaptive Feature Augmentation for Long-Tailed Image Classification
Yan Hong, Jianfu Zhang, Zhongyi Sun, Ke Yan
[pdf]
[DOI]

Negative Samples Are at Large: Leveraging Hard-Distance Elastic Loss for Re-identification
Hyungtae Lee, Sungmin Eum, Heesung Kwon
[pdf]
[DOI]

Discrete-Constrained Regression for Local Counting Models
Haipeng Xiong, Angela Yao
[pdf]
[DOI]

Breadcrumbs: Adversarial Class-Balanced Sampling for Long-Tailed Recognition
Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
[pdf]
[DOI]

Chairs Can Be Stood On: Overcoming Object Bias in Human-Object Interaction Detection
Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan Kankanhalli
[pdf]
[DOI]

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric Xing
[pdf]
[DOI]

DICE: Leveraging Sparsification for Out-of-Distribution Detection
Yiyou Sun, Yixuan Li
[pdf]
[DOI]

Invariant Feature Learning for Generalized Long-Tailed Classification
Kaihua Tang, Mingyuan Tao, Jiaxin Qi, Zhenguang Liu, Hanwang Zhang
[pdf]
[DOI]

Sliced Recursive Transformer
Zhiqiang Shen, Zechun Liu, Eric Xing
[pdf]
[DOI]

Cross-Domain Ensemble Distillation for Domain Generalization
Kyungmoon Lee, Sungyeon Kim, Suha Kwak
[pdf]
[DOI]

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels
Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu
[pdf]
[DOI]

Hyperspherical Learning in Multi-Label Classification
Bo Ke, Yunquan Zhu, Mengtian Li, Xiujun Shu, Ruizhi Qiao, Bo Ren
[pdf]
[DOI]

When Active Learning Meets Implicit Semantic Data Augmentation
Zhuangzhuang Chen, Jin Zhang, Pan Wang, Jie Chen, Jianqiang Li
[pdf]
[DOI]

VL-LTR: Learning Class-Wise Visual-Linguistic Representation for Long-Tailed Visual Recognition
Changyao Tian, Wenhai Wang, Xizhou Zhu, Jifeng Dai, Yu Qiao
[pdf]
[DOI]

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-of-Distribution Generalization
Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang
[pdf]
[DOI]

Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection
Gaoang Wang, Yibing Zhan, Xinchao Wang, Mingli Song, Klara Nahrstedt
[pdf]
[DOI]

Tracking by Associating Clips
Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee
[pdf]
[DOI]

RealPatch: A Statistical Matching Framework for Model Patching with Real Samples
Sara Romiti, Christopher Inskip, Viktoriia Sharmanska, Novi Quadrianto
[pdf]
[DOI]

Background-Insensitive Scene Text Recognition with Text Semantic Segmentation
Liang Zhao, Zhenyao Wu, Xinyi Wu, Greg Wilsbacher, Song Wang
[pdf]
[DOI]

Semantic Novelty Detection via Relational Reasoning
Francesco Cappio Borlino, Silvia Bucci, Tatiana Tommasi
[pdf]
[DOI]

Improving Closed and Open-Vocabulary Attribute Prediction Using Transformers
Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava
[pdf]
[DOI]

Training Vision Transformers with Only 2040 Images
Yun-Hao Cao, Hao Yu, Jianxin Wu
[pdf]
[DOI]

Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection
Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee
[pdf]
[DOI]

TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs
Shantanu Jaiswal, Basura Fernando, Cheston Tan
[pdf]
[DOI]

Automatic Check-Out via Prototype-Based Classifier Learning from Single-Product Exemplars
Hao Chen, Xiu-Shen Wei, Faen Zhang, Yang Shen, Hui Xu, Liang Xiao
[pdf]
[DOI]

Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain
Piyapat Saranrittichai, Chaithanya Kumar Mummadi, Claudia Blaiotta, Mauricio Munoz, Volker Fischer
[pdf]
[DOI]

Photo-Realistic Neural Domain Randomization
Sergey Zakharov, Rareș Ambruș, Vitor Guizilini, Wadim Kehl, Adrien Gaidon
[pdf]
[DOI]

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao, Yingwei Pan, Yehao Li, Chong-Wah Ngo, Tao Mei
[pdf]
[DOI]

Tailoring Self-Supervision for Supervised Learning
WonJun Moon, Ji-Hwan Kim, Jae-Pil Heo
[pdf]
[DOI]

Difficulty-Aware Simulator for Open Set Recognition
WonJun Moon, Junho Park, Hyun Seok Seong, Cheol-Ho Cho, Jae-Pil Heo
[pdf]
[DOI]

Few-Shot Class-Incremental Learning from an Open-Set Perspective
Can Peng, Kun Zhao, Tianren Wang, Meng Li, Brian C. Lovell
[pdf]
[DOI]

FOSTER: Feature Boosting and Compression for Class-Incremental Learning
Fu-Yun Wang, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan
[pdf]
[DOI]

Visual Knowledge Tracing
Neehar Kondapaneni, Pietro Perona, Oisin Mac Aodha
[pdf]
[DOI]

S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning
Jayateja Kalla, Soma Biswas
[pdf]
[DOI]

Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism
Yangyang Shu, Baosheng Yu, Haiming Xu, Lingqiao Liu
[pdf]
[DOI]

VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang, Yufei Xu, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

Unbiased Manifold Augmentation for Coarse Class Subdivision
Baoming Yan, Ke Gao, Bo Gao, Lin Wang, Jiang Yang, Xiaobo Li
[pdf]
[DOI]

DenseHybrid: Hybrid Anomaly Detection for Dense Open-Set Recognition
Matej Grcić, Petra Bevandić, Siniša Šegvić
[pdf]
[DOI]

Rethinking Confidence Calibration for Failure Prediction
Fei Zhu, Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
[pdf]
[DOI]

Uncertainty-Guided Source-Free Domain Adaptation
Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin
[pdf]
[DOI]

Should All Proposals Be Treated Equally in Object Detection?
Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Pei Yu, Ying Jin, Lu Yuan, Zicheng Liu, Nuno Vasconcelos
[pdf]
[DOI]

VIP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers
Junbo Li, Huan Zhang, Cihang Xie
[pdf]
[DOI]

incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection
Amanda Rios, Nilesh Ahuja, Ibrahima Ndiour, Utku Genc, Laurent Itti, Omesh Tickoo
[pdf]
[DOI]

IGFormer: Interaction Graph Transformer for Skeleton-Based Human Interaction Recognition
Yunsheng Pang, Qiuhong Ke, Hossein Rahmani, James Bailey, Jun Liu
[pdf]
[DOI]

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions
Apostolos Modas, Rahul Rade, Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard
[pdf]
[DOI]

Rotation Regularization without Rotation
Takumi Kobayashi
[pdf]
[DOI]

Towards Accurate Open-Set Recognition via Background-Class Regularization
Wonwoo Cho, Jaegul Choo
[pdf]
[DOI]

In Defense of Image Pre-training for Spatiotemporal Recognition
Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie
[pdf]
[DOI]

Augmenting Deep Classifiers with Polynomial Neural Networks
Grigorios G. Chrysos, Markos Georgopoulos, Jiankang Deng, Jean Kossaifi, Yannis Panagakis, Anima Anandkumar
[pdf]
[DOI]

Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection
Seong Min Kye, Kwanghee Choi, Joonyoung Yi, Buru Chang
[pdf]
[DOI]

Online Task-Free Continual Learning with Dynamic Sparse Distributed Memory
Julien Pourcel, Ngoc-Son Vu, Robert M. French
[pdf]
[DOI]

Contrastive Deep Supervision
Linfeng Zhang, Xin Chen, Junbo Zhang, Runpei Dong, Kaisheng Ma
[pdf]
[DOI]

Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective
Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, Renjie Song, Boyan Zhou, Jiajun Liang, Osamu Yoshie
[pdf]
[DOI]

LocVTP: Video-Text Pre-training for Temporal Localization
Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
[pdf]
[DOI]

Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads
Jiawei Ma, Guangxing Han, Shiyuan Huang, Yuncong Yang, Shih-Fu Chang
[pdf]
[DOI]

Implicit Neural Representations for Image Compression
Yannick Strümpler, Janis Postels, Ren Yang, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

LiP-Flow: Learning Inference-Time Priors for Codec Avatars via Normalizing Flows in Latent Space
Emre Aksan, Shugao Ma, Akin Caliskan, Stanislav Pidhorskyi, Alexander Richard, Shih-En Wei, Jason Saragih, Otmar Hilliges
[pdf]
[DOI]

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining
Qihang Zhang, Zhenghao Peng, Bolei Zhou
[pdf]
[DOI]

Learning Ego 3D Representation As Ray Tracing
Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang
[pdf]
[DOI]

Static and Dynamic Concepts for Self-Supervised Video Representation Learning
Rui Qian, Shuangrui Ding, Xian Liu, Dahua Lin
[pdf]
[DOI]

SphereFed: Hyperspherical Federated Learning
Xin Dong, Sai Qian Zhang, Ang Li, H.T. Kung
[pdf]
[DOI]

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas
[pdf]
[DOI]

Posterior Refinement on Metric Matrix Improves Generalization Bound in Metric Learning
Mingda Wang, Canqian Yang, Yi Xu
[pdf]
[DOI]

Balancing Stability and Plasticity through Advanced Null Space in Continual Learning
Yajing Kong, Liu Liu, Zhen Wang, Dacheng Tao
[pdf]
[DOI]

DisCo: Remedying Self-Supervised Learning on Lightweight Models with Distilled Contrastive Learning
Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen
[pdf]
[DOI]

CoSCL: Cooperation of Small Continual Learners Is Stronger than a Big One
Liyuan Wang, Xingxing Zhang, Qian Li, Jun Zhu, Yi Zhong
[pdf]
[DOI]

Manifold Adversarial Learning for Cross-Domain 3D Shape Representation
Hao Huang, Cheng Chen, Yi Fang
[pdf]
[DOI]

Fast-MoCo: Boost Momentum-Based Contrastive Learning with Combinatorial Patches
Yuanzheng Ci, Chen Lin, Lei Bai, Wanli Ouyang
[pdf]
[DOI]

LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
Boyan Jiang, Xinlin Ren, Mingsong Dou, Xiangyang Xue, Yanwei Fu, Yinda Zhang
[pdf]
[DOI]

On the Versatile Uses of Partial Distance Correlation in Deep Learning
Xingjian Zhen, Zihang Meng, Rudrasis Chakraborty, Vikas Singh
[pdf]
[DOI]

Self-Regulated Feature Learning via Teacher-Free Feature Distillation
Lujun Li
[pdf]
[DOI]

Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning
Mingfu Liang, Jiahuan Zhou, Wei Wei, Ying Wu
[pdf]
[DOI]

Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
Xulin Li, Yan Lu, Bin Liu, Yating Liu, Guojun Yin, Qi Chu, Jinyang Huang, Feng Zhu, Rui Zhao, Nenghai Yu
[pdf]
[DOI]

DAS: Densely-Anchored Sampling for Deep Metric Learning
Lizhao Liu, Shangxin Huang, Zhuangwei Zhuang, Ran Yang, Mingkui Tan, Yaowei Wang
[pdf]
[DOI]

Learn from All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition
Yuhang Zhang, Chengrui Wang, Xu Ling, Weihong Deng
[pdf]
[DOI]

A Non-Isotropic Probabilistic Take On Proxy-Based Deep Metric Learning
Michael Kirchhof, Karsten Roth, Zeynep Akata, Enkelejda Kasneci
[pdf]
[DOI]

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers
Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li, Yu Liu
[pdf]
[DOI]

UFO: Unified Feature Optimization
Teng Xi, Yifan Sun, Deli Yu, Bi Li, Nan Peng, Gang Zhang, Xinyu Zhang, Zhigang Wang, Jinwen Chen, Jian Wang, Lufei Liu, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen, David F. Fouhey, Andrew Owens
[pdf]
[DOI]

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Kun Wang, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao
[pdf]
[DOI]

SLIP: Self-Supervision Meets Language-Image Pre-training
Norman Mu, Alexander Kirillov, David Wagner, Saining Xie
[pdf]
[DOI]

Discovering Deformable Keypoint Pyramids
Jianing Qian, Anastasios Panagopoulos, Dinesh Jayaraman
[pdf]
[DOI]

Neural Video Compression Using GANs for Detail Synthesis and Propagation
Fabian Mentzer, Eirikur Agustsson, Johannes Ballé, David Minnen, Nick Johnston, George Toderici
[pdf]
[DOI]

A Contrastive Objective for Learning Disentangled Representations
Jonathan Kahana, Yedid Hoshen
[pdf]
[DOI]

PT4AL: Using Self-Supervised Pretext Tasks for Active Learning
John Seon Keun Yi, Minseok Seo, Jongchan Park, Dong-Geol Choi
[pdf]
[DOI]

ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer
Haokui Zhang, Wenze Hu, Xiaoyu Wang
[pdf]
[DOI]

DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning
Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
[pdf]
[DOI]

Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective
Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Chenyu Wang, Wanli Ouyang
[pdf]
[DOI]

Decoupled Contrastive Learning
Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu, Yubei Chen, Yann LeCun
[pdf]
[DOI]

Joint Learning of Localized Representations from Medical Images and Reports
Philip Müller, Georgios Kaissis, Congyu Zou, Daniel Rueckert
[pdf]
[DOI]

The Challenges of Continuous Self-Supervised Learning
Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta
[pdf]
[DOI]

Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Zhixin Ling, Zhen Xing, Jian Zhou, Xiangdong Zhou
[pdf]
[DOI]

Identifying Hard Noise in Long-Tailed Sample Distribution
Xuanyu Yi, Kaihua Tang, Xian-Sheng Hua, Joo-Hwee Lim, Hanwang Zhang
[pdf]
[DOI]

Relative Contrastive Loss for Unsupervised Representation Learning
Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang
[pdf]
[DOI]

Fine-Grained Fashion Representation Learning by Online Deep Clustering
Yang Jiao, Ning Xie, Yan Gao, Chien-chih Wang, Yi Sun
[pdf]
[DOI]

NashAE: Disentangling Representations through Adversarial Covariance Minimization
Eric Yeats, Frank Liu, David Womble, Hai Li
[pdf]
[DOI]

A Gyrovector Space Approach for Symmetric Positive Semi-Definite Matrix Learning
Xuan Son Nguyen
[pdf]
[DOI]

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan
[pdf]
[DOI]

Contrasting Quadratic Assignments for Set-Based Representation Learning
Artem Moskalev, Ivan Sosnovik, Volker Fischer, Arnold Smeulders
[pdf]
[DOI]

Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer
Arjun Ashok, K J Joseph, Vineeth N Balasubramanian
[pdf]
[DOI]

Object Discovery and Representation Networks
Olivier J. Hénaff, Skanda Koppula, Evan Shelhamer, Daniel Zoran, Andrew Jaegle, Andrew Zisserman, João Carreira, Relja Arandjelović
[pdf]
[DOI]

Trading Positional Complexity vs Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey
[pdf]
[DOI]

MVDG: A Unified Multi-View Framework for Domain Generalization
Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao
[pdf]
[DOI]

Panoptic Scene Graph Generation
Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
[pdf]
[DOI]

Object-Compositional Neural Implicit Surfaces
Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
[pdf]
[DOI]

RigNet: Repetitive Image Guided Network for Depth Completion
Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
[pdf]
[DOI]

FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling
Hao Lu, Wenze Liu, Hongtao Fu, Zhiguo Cao
[pdf]
[DOI]

LiDAL: Inter-Frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation
Zeyu Hu, Xuyang Bai, Runze Zhang, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai
[pdf]
[DOI]

Hierarchical Memory Learning for Fine-Grained Scene Graph Generation
Youming Deng, Yansheng Li, Yongjun Zhang, Xiang Xiang, Jian Wang, Jingdong Chen, Jiayi Ma
[pdf]
[DOI]

DODA: Data-Oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation
Runyu Ding, Jihan Yang, Li Jiang, Xiaojuan Qi
[pdf]
[DOI]

MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning
Xiaogang Xu, Hengshuang Zhao, Vibhav Vineet, Ser-Nam Lim, Antonio Torralba
[pdf]
[DOI]

MonoPLFlowNet: Permutohedral Lattice FlowNet for Real-Scale 3D Scene Flow Estimation with Monocular Images
Runfa Li, Truong Nguyen
[pdf]
[DOI]

TO-Scene: A Large-Scale Dataset for Understanding 3D Tabletop Scenes
Mutian Xu, Pei Chen, Haolin Liu, Xiaoguang Han
[pdf]
[DOI]

Is It Necessary to Transfer Temporal Knowledge for Domain Adaptive Video Semantic Segmentation?
Xinyi Wu, Zhenyao Wu, Jin Wan, Lili Ju, Song Wang
[pdf]
[DOI]

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation
Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu
[pdf]
[DOI]

Improving the Reliability for Confidence Estimation
Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu
[pdf]
[DOI]

Fine-Grained Scene Graph Generation with Data Transfer
Ao Zhang, Yuan Yao, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua
[pdf]
[DOI]

Pose2Room: Understanding 3D Scenes from Human Activities
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
[pdf]
[DOI]

Towards Hard-Positive Query Mining for DETR-Based Human-Object Interaction Detection
Xubin Zhong, Changxing Ding, Zijian Li, Shaoli Huang
[pdf]
[DOI]

Discovering Human-Object Interaction Concepts via Self-Compositional Learning
Zhi Hou, Baosheng Yu, Dacheng Tao
[pdf]
[DOI]

Primitive-Based Shape Abstraction via Nonparametric Bayesian Inference
Yuwei Wu, Weixiao Liu, Sipu Ruan, Gregory S. Chirikjian
[pdf]
[DOI]

Stereo Depth Estimation with Echoes
Chenghao Zhang, Kun Tian, Bolin Ni, Gaofeng Meng, Bin Fan, Zhaoxiang Zhang, Chunhong Pan
[pdf]
[DOI]

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding
Hanrong Ye, Dan Xu
[pdf]
[DOI]

PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

S2Net: Stochastic Sequential Pointcloud Forecasting
Xinshuo Weng, Junyu Nan, Kuan-Hui Lee, Rowan McAllister, Adrien Gaidon, Nicholas Rhinehart, Kris M. Kitani
[pdf]
[DOI]

RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation
Mu He, Le Hui, Yikai Bian, Jian Ren, Jin Xie, Jian Yang
[pdf]
[DOI]

PolyphonicFormer: Unified Query Learning for Depth-Aware Video Panoptic Segmentation
Haobo Yuan, Xiangtai Li, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, Dacheng Tao
[pdf]
[DOI]

SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds
Qingyong Hu, Bo Yang, Guangchi Fang, Yulan Guo, Aleš Leonardis, Niki Trigoni, Andrew Markham
[pdf]
[DOI]

PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe, Chunghyun Park, Francois Rameau, Jaesik Park, In So Kweon
[pdf]
[DOI]

Initialization and Alignment for Adversarial Texture Optimization
Xiaoming Zhao, Zhizhen Zhao, Alexander G. Schwing
[pdf]
[DOI]

MOTR: End-to-End Multiple-Object Tracking with TRansformer
Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei
[pdf]
[DOI]

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing
Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
[pdf]
[DOI]

LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments
Henry Howard-Jenkins, Victor Adrian Prisacariu
[pdf]
[DOI]

3D-PL: Domain Adaptive Depth Estimation with 3D-Aware Pseudo-Labeling
Yu-Ting Yen, Chia-Ni Lu, Wei-Chen Chiu, Yi-Hsuan Tsai
[pdf]
[DOI]

Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation
Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
[pdf]
[DOI]

Salient Object Detection for Point Clouds
Songlin Fan, Wei Gao, Ge Li
[pdf]
[DOI]

Learning Semantic Segmentation from Multiple Datasets with Label Shifts
Dongwan Kim, Yi-Hsuan Tsai, Yumin Suh, Masoud Faraki, Sparsh Garg, Manmohan Chandraker, Bohyung Han
[pdf]
[DOI]

Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination
Kangcheng Liu, Yuzhi Zhao, Qiang Nie, Zhi Gao, Ben M. Chen
[pdf]
[DOI]

Towards Open-Vocabulary Scene Graph Generation with Prompt-Based Finetuning
Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li
[pdf]
[DOI]

Variance-Aware Weight Initialization for Point Convolutional Neural Networks
Pedro Hermosilla, Michael Schelling, Tobias Ritschel, Timo Ropinski
[pdf]
[DOI]

Break and Make: Interactive Structural Understanding Using LEGO Bricks
Aaron Walsman, Muru Zhang, Klemen Kotar, Karthik Desingh, Ali Farhadi, Dieter Fox
[pdf]
[DOI]

Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation
Wencan Cheng, Jong Hwan Ko
[pdf]
[DOI]

3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching
Runyu Mao, Chen Bai, Yatong An, Fengqing Zhu, Cheng Lu
[pdf]
[DOI]

Video Restoration Framework and Its Meta-Adaptations to Data-Poor Conditions
Prashant W Patil, Sunil Gupta, Santu Rana, Svetha Venkatesh
[pdf]
[DOI]

MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud
Michaël Ramamonjisoa, Sinisa Stekovic, Vincent Lepetit
[pdf]
[DOI]

Scene Text Recognition with Permuted Autoregressive Sequence Models
Darwin Bautista, Rowel Atienza
[pdf]
[DOI]

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition
Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai
[pdf]
[DOI]

Detecting Tampered Scene Text in the Wild
Yuxin Wang, Hongtao Xie, Mengting Xing, Jing Wang, Shenggao Zhu, Yongdong Zhang
[pdf]
[DOI]

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning
Jingqun Tang, Wenming Qian, Luchuan Song, Xiena Dong, Lan Li, Xiang Bai
[pdf]
[DOI]

GLASS: Global to Local Attention for Scene-Text Spotting
Roi Ronen, Shahar Tsiper, Oron Anschel, Inbal Lavi, Amir Markovitz, R. Manmatha
[pdf]
[DOI]

COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts
Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa
[pdf]
[DOI]

Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
Chuhui Xue, Wenqing Zhang, Yu Hao, Shijian Lu, Philip H. S. Torr, Song Bai
[pdf]
[DOI]

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen Wang, Xiang Bai
[pdf]
[DOI]

Levenshtein OCR
Cheng Da, Peng Wang, Cong Yao
[pdf]
[DOI]

Multi-Granularity Prediction for Scene Text Recognition
Peng Wang, Cheng Da, Cong Yao
[pdf]
[DOI]

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting
Ying Chen, Liang Qiao, Zhanzhan Cheng, Shiliang Pu, Yi Niu, Xi Li
[pdf]
[DOI]

Contextual Text Block Detection towards Scene Text Understanding
Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai
[pdf]
[DOI]

CoMER: Modeling Coverage for Transformer-Based Handwritten Mathematical Expression Recognition
Wenqi Zhao, Liangcai Gao
[pdf]
[DOI]

Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context
Chongyu Liu, Lianwen Jin, Yuliang Liu, Canjie Luo, Bangdong Chen, Fengjun Guo, Kai Ding
[pdf]
[DOI]

TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
Oren Nuriel, Sharon Fogel, Ron Litman
[pdf]
[DOI]

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
Byeonghu Na, Yoonsik Kim, Sungrae Park
[pdf]
[DOI]

SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition
Dajian Zhong, Shujing Lyu, Palaiahnakote Shivakumara, Bing Yin, Jiajia Wu, Umapada Pal, Yue Lu
[pdf]
[DOI]

Pure Transformer with Integrated Experts for Scene Text Recognition
Yew Lee Tan, Adams Wai-Kin Kong, Jung-Jae Kim
[pdf]
[DOI]

OCR-Free Document Understanding Transformer
Geewook Kim, Teakgyu Hong, Moonbin Yim, JeongYeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park
[pdf]
[DOI]

CAR: Class-Aware Regularizations for Semantic Segmentation
Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Linchao Bao, Xiangjian He
[pdf]
[DOI]

Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation
Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee
[pdf]
[DOI]

SeqFormer: Sequential Transformer for Video Instance Segmentation
Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai
[pdf]
[DOI]

Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection
Wenhu Zhang, Liangli Zheng, Huanyu Wang, Xintian Wu, Xi Li
[pdf]
[DOI]

In Defense of Online Models for Video Instance Segmentation
Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai
[pdf]
[DOI]

Active Pointly-Supervised Instance Segmentation
Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
[pdf]
[DOI]

A Transformer-Based Decoder for Semantic Segmentation with Multi-level Context Mining
Bowen Shi, Dongsheng Jiang, Xiaopeng Zhang, Han Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Ho Kei Cheng, Alexander G. Schwing
[pdf]
[DOI]

Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving
Jiale Li, Hang Dai, Yong Ding
[pdf]
[DOI]

2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
Xu Yan, Jiantao Gao, Chaoda Zheng, Chao Zheng, Ruimao Zhang, Shuguang Cui, Zhen Li
[pdf]
[DOI]

Extract Free Dense Labels from CLIP
Chong Zhou, Chen Change Loy, Bo Dai
[pdf]
[DOI]

3D Compositional Zero-Shot Learning with DeCompositional Consensus
Muhammad Ferjad Naeem, Evin Pınar Örnek, Yongqin Xian, Luc Van Gool, Federico Tombari
[pdf]
[DOI]

Video Mask Transfiner for High-Quality Video Instance Segmentation
Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
[pdf]
[DOI]

Box-Supervised Instance Segmentation with Level Set Evolution
Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xian-Sheng Hua, Lei Zhang
[pdf]
[DOI]

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi
[pdf]
[DOI]

Adaptive Agent Transformer for Few-Shot Segmentation
Yuan Wang, Rui Sun, Zhe Zhang, Tianzhu Zhang
[pdf]
[DOI]

Waymo Open Dataset: Panoramic Video Panoptic Segmentation
Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar
[pdf]
[DOI]

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation
Zhaoyuan Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin
[pdf]
[DOI]

AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-Shot Interactions
Yian Wang, Ruihai Wu, Kaichun Mo, Jiaqi Ke, Qingnan Fan, Leonidas J. Guibas, Hao Dong
[pdf]
[DOI]

Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong, Seokju Cho, Jisu Nam, Stephen Lin, Seungryong Kim
[pdf]
[DOI]

Fine-Grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications
Lingzhi Zhang, Shenghao Zhou, Simon Stent, Jianbo Shi
[pdf]
[DOI]

Perceptual Artifacts Localization for Inpainting
Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
[pdf]
[DOI]

2D Amodal Instance Segmentation Guided by 3D Shape Prior
Zhixuan Li, Weining Ye, Tingting Jiang, Tiejun Huang
[pdf]
[DOI]

Data Efficient 3D Learner via Knowledge Transferred from 2D Model
Ping-Chung Yu, Cheng Sun, Min Sun
[pdf]
[DOI]

Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation
Tong Wu, Guangyu Gao, Junshi Huang, Xiaolin Wei, Xiaoming Wei, Chi Harold Liu
[pdf]
[DOI]

Dense Gaussian Processes for Few-Shot Segmentation
Joakim Johnander, Johan Edstedt, Michael Felsberg, Fahad Shahbaz Khan, Martin Danelljan
[pdf]
[DOI]

3D Instances as 1D Kernels
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong
[pdf]
[DOI]

TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai, Fanglei Xue, Lele Xu, Lili Guo
[pdf]
[DOI]

MVSalNet: Multi-View Augmentation for RGB-D Salient Object Detection
Jiayuan Zhou, Lijun Wang, Huchuan Lu, Kaining Huang, Xinchu Shi, Bocong Liu
[pdf]
[DOI]

k-Means Mask Transformer
Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
[pdf]
[DOI]

SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness
Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip H. S. Torr
[pdf]
[DOI]

Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hyeokjun Kweon, Jegyeong Cho, Shinjeong Kim, Kuk-Jin Yoon
[pdf]
[DOI]

Continual Semantic Segmentation via Structure Preserving and Projected Feature Alignment
Zihan Lin, Zilei Wang, Yixin Zhang
[pdf]
[DOI]

Interclass Prototype Relation for Few-Shot Segmentation
Atsuro Okazawa
[pdf]
[DOI]

Slim Scissors: Segmenting Thin Object from Synthetic Background
Kunyang Han, Jun Hao Liew, Jiashi Feng, Huawei Tian, Yao Zhao, Yunchao Wei
[pdf]
[DOI]

Abstracting Sketches through Simple Primitives
Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, Zeynep Akata
[pdf]
[DOI]

Multi-Scale and Cross-Scale Contrastive Learning for Semantic Segmentation
Theodoros Pissas, Claudio S. Ravasio, Lyndon Da Cruz, Christos Bergeles
[pdf]
[DOI]

One-Trimap Video Matting
Hongje Seong, Seoung Wug Oh, Brian Price, Euntai Kim, Joon-Young Lee
[pdf]
[DOI]

D2ADA: Dynamic Density-Aware Active Domain Adaptation for Semantic Segmentation
Tsung-Han Wu, Yi-Syuan Liou, Shao-Ji Yuan, Hsin-Ying Lee, Tung-I Chen, Kuan-Chih Huang, Winston H. Hsu
[pdf]
[DOI]

Learning Quality-Aware Dynamic Memory for Video Object Segmentation
Yong Liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang
[pdf]
[DOI]

Learning Implicit Feature Alignment Function for Semantic Segmentation
Hanzhe Hu, Yinbo Chen, Jiarui Xu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang
[pdf]
[DOI]

Quantum Motion Segmentation
Federica Arrigoni, Willi Menapace, Marcel Seelbach Benkner, Elisa Ricci, Vladislav Golyanik
[pdf]
[DOI]

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation
Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei
[pdf]
[DOI]

Laplacian Mesh Transformer: Dual Attention and Topology Aware Network for 3D Mesh Classification and Segmentation
Xiao-Juan Li, Jie Yang, Fang-Lue Zhang
[pdf]
[DOI]

Geodesic-Former: A Geodesic-Guided Few-Shot 3D Point Cloud Instance Segmenter
Tuan Ngo, Khoi Nguyen
[pdf]
[DOI]

Union-Set Multi-source Model Adaptation for Semantic Segmentation
Zongyao Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
[pdf]
[DOI]

Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions
Ardian Umam, Cheng-Kun Yang, Yung-Yu Chuang, Jen-Hui Chuang, Yen-Yu Lin
[pdf]
[DOI]

BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation
Ye Yu, Jialin Yuan, Gaurav Mittal, Li Fuxin, Mei Chen
[pdf]
[DOI]

SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection
Minhyeok Lee, Chaewon Park, Suhwan Cho, Sangyoun Lee
[pdf]
[DOI]

Global Spectral Filter Memory Network for Video Object Segmentation
Yong Liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang
[pdf]
[DOI]

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer
Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan
[pdf]
[DOI]

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
Haodi He, Yuhui Yuan, Xiangyu Yue, Han Hu
[pdf]
[DOI]

Learning Topological Interactions for Multi-Class Medical Image Segmentation
Saumya Gupta, Xiaoling Hu, James Kaan, Michael Jin, Mutshipay Mpoy, Katherine Chung, Gagandeep Singh, Mary Saltz, Tahsin Kurc, Joel Saltz, Apostolos Tassiopoulos, Prateek Prasanna, Chao Chen
[pdf]
[DOI]

Unsupervised Segmentation in Real-World Images via Spelke Object Inference
Honglin Chen, Rahul Venkatesh, Yoni Friedman, Jiajun Wu, Joshua B. Tenenbaum, Daniel L. K. Yamins, Daniel M. Bear
[pdf]
[DOI]

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model
Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai
[pdf]
[DOI]

Fast Two-View Motion Segmentation Using Christoffel Polynomials
Bengisu Ozbay, Octavia Camps, Mario Sznaier
[pdf]
[DOI]

UCTNet: Uncertainty-Aware Cross-Modal Transformer Network for Indoor RGB-D Semantic Segmentation
Xiaowen Ying, Mooi Choo Chuah
[pdf]
[DOI]

Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation
Geon Lee, Chanho Eom, Wonkyung Lee, Hyekang Park, Bumsub Ham
[pdf]
[DOI]

Learning Regional Purity for Instance Segmentation on 3D Point Clouds
Shichao Dong, Guosheng Lin, Tzu-Yi Hung
[pdf]
[DOI]

Cross-Domain Few-Shot Semantic Segmentation
Shuo Lei, Xuchao Zhang, Jianfeng He, Fanglan Chen, Bowen Du, Chang-Tien Lu
[pdf]
[DOI]

Generative Subgraph Contrast for Self-Supervised Graph Representation Learning
Yuehui Han, Le Hui, Haobo Jiang, Jianjun Qian, Jin Xie
[pdf]
[DOI]

SdAE: Self-Distillated Masked Autoencoder
Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
[pdf]
[DOI]

Demystifying Unsupervised Semantic Correspondence Estimation
Mehmet Aygün, Oisin Mac Aodha
[pdf]
[DOI]

Open-Set Semi-Supervised Object Detection
Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira
[pdf]
[DOI]

Vibration-Based Uncertainty Estimation for Learning from Limited Supervision
Hengtong Hu, Lingxi Xie, Xinyue Huo, Richang Hong, Qi Tian
[pdf]
[DOI]

Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation
Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu
[pdf]
[DOI]

Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-Class Appearance Consistency
Jun Wei, Sheng Wang, S. Kevin Zhou, Shuguang Cui, Zhen Li
[pdf]
[DOI]

Active Learning Strategies for Weakly-Supervised Object Detection
Huy V. Vo, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Jean Ponce
[pdf]
[DOI]

Mc-BEiT: Multi-Choice Discretization for Image BERT Pre-training
Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Ling-Yu Duan
[pdf]
[DOI]

Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu
[pdf]
[DOI]

Unsupervised Visual Representation Learning by Synchronous Momentum Grouping
Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu
[pdf]
[DOI]

Improving Few-Shot Part Segmentation Using Coarse Supervision
Oindrila Saha, Zezhou Cheng, Subhransu Maji
[pdf]
[DOI]

What to Hide from Your Students: Attention-Guided Masked Image Modeling
Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis
[pdf]
[DOI]

Pointly-Supervised Panoptic Segmentation
Junsong Fan, Zhaoxiang Zhang, Tieniu Tan
[pdf]
[DOI]

MVP: Multimodality-Guided Visual Pre-training
Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
[pdf]
[DOI]

Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection
Wen-Yan Lin, Zhonghang Liu, Siying Liu
[pdf]
[DOI]

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation
Lukas Hoyer, Dengxin Dai, Luc Van Gool
[pdf]
[DOI]

SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation
Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, Onkar Dabeer
[pdf]
[DOI]

Dual-Domain Self-Supervised Learning and Model Adaption for Deep Compressive Imaging
Yuhui Quan, Xinran Qin, Tongyao Pang, Hui Ji
[pdf]
[DOI]

Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Xudong Wang, Long Lian, Stella X. Yu
[pdf]
[DOI]

Max Pooling with Vision Transformers Reconciles Class and Shape in Weakly Supervised Semantic Segmentation
Simone Rossetti, Damiano Zappia, Marta Sanzari, Marco Schaerf, Fiora Pirri
[pdf]
[DOI]

Dense Siamese Network for Dense Unsupervised Learning
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
[pdf]
[DOI]

Multi-Granularity Distillation Scheme towards Lightweight Semi-Supervised Semantic Segmentation
Jie Qin, Jie Wu, Ming Li, Xuefeng Xiao, Min Zheng, Xingang Wang
[pdf]
[DOI]

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation
Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
[pdf]
[DOI]

Self-Filtering: A Noise-Aware Sample Selection for Label Noise with Confidence Penalization
Qi Wei, Haoliang Sun, Xiankai Lu, Yilong Yin
[pdf]
[DOI]

RDA: Reciprocal Distribution Alignment for Robust Semi-Supervised Learning
Yue Duan, Lei Qi, Lei Wang, Luping Zhou, Yinghuan Shi
[pdf]
[DOI]

MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation
Tarun Kalluri, Astuti Sharma, Manmohan Chandraker
[pdf]
[DOI]

United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning
Wenda Zhao, Fei Wei, You He, Huchuan Lu
[pdf]
[DOI]

Synergistic Self-Supervised and Quantization Learning
Yun-Hao Cao, Peiqin Sun, Yechang Huang, Jianxin Wu, Shuchang Zhou
[pdf]
[DOI]

Semi-Supervised Vision Transformers
Zejia Weng, Xitong Yang, Ang Li, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Domain Adaptive Video Segmentation via Temporal Pseudo Supervision
Yun Xing, Dayan Guan, Jiaxing Huang, Shijian Lu
[pdf]
[DOI]

Diverse Learner: Exploring Diverse Supervision for Semi-Supervised Object Detection
Linfeng Li, Minyue Jiang, Yue Yu, Wei Zhang, Xiangru Lin, Yingying Li, Xiao Tan, Jingdong Wang, Errui Ding
[pdf]
[DOI]

A Closer Look at Invariances in Self-Supervised Pre-training for 3D Vision
Lanxiao Li, Michael Heizmann
[pdf]
[DOI]

ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization
Jiwon Kim, Youngjo Min, Daehwan Kim, Gyuseong Lee, Junyoung Seo, Kwangrok Ryoo, Seungryong Kim
[pdf]
[DOI]

FedX: Unsupervised Federated Learning with Cross Knowledge Distillation
Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Chuhan Wu, Xing Xie, Meeyoung Cha
[pdf]
[DOI]

W2N: Switching from Weak Supervision to Noisy Supervision for Object Detection
Zitong Huang, Yiping Bao, Bowen Dong, Erjin Zhou, Wangmeng Zuo
[pdf]
[DOI]

Decoupled Adversarial Contrastive Learning for Self-Supervised Adversarial Robustness
Chaoning Zhang, Kang Zhang, Chenshuang Zhang, Axi Niu, Jiu Feng, Chang D. Yoo, In So Kweon
[pdf]
[DOI]

GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning
Huseyin Coskun, Alireza Zareian, Joshua L. Moore, Federico Tombari, Chen Wang
[pdf]
[DOI]

Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning
K L Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Kossar Pourahmadi, Akshayvarun Subramanya, Hamed Pirsiavash
[pdf]
[DOI]

Revisiting the Critical Factors of Augmentation-Invariant Representation Learning
Junqiang Huang, Xiangwen Kong, Xiangyu Zhang
[pdf]
[DOI]

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation
Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia
[pdf]
[DOI]

Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation
Zhonghua Wu, Yicheng Wu, Guosheng Lin, Jianfei Cai, Chen Qian
[pdf]
[DOI]

Semantic-Aware Fine-Grained Correspondence
Yingdong Hu, Renhao Wang, Kaifeng Zhang, Yang Gao
[pdf]
[DOI]

Self-Supervised Classification Network
Elad Amrani, Leonid Karlinsky, Alex Bronstein
[pdf]
[DOI]

Data Invariants to Understand Unsupervised Out-of-Distribution Detection
Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila
[pdf]
[DOI]

Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains
Haiyang Yang, Shixiang Tang, Meilin Chen, Yizhou Wang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang
[pdf]
[DOI]

Semi-Supervised Object Detection via Virtual Category Learning
Changrui Chen, Kurt Debattista, Jungong Han
[pdf]
[DOI]

Completely Self-Supervised Crowd Counting via Distribution Matching
Deepak Babu Sam, Abhinav Agarwalla, Jimmy Joseph, Vishwanath A. Sindagi, R. Venkatesh Babu, Vishal M. Patel
[pdf]
[DOI]

Coarse-to-Fine Incremental Few-Shot Learning
Xiang Xiang, Yuwen Tan, Qian Wan, Jing Ma, Alan Yuille, Gregory D. Hager
[pdf]
[DOI]

Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling
Jian Hu, Haowen Zhong, Fei Yang, Shaogang Gong, Guile Wu, Junchi Yan
[pdf]
[DOI]

Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition
Shreyank N Gowda, Marcus Rohrbach, Frank Keller, Laura Sevilla-Lara
[pdf]
[DOI]

CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation
Renhao Wang, Hang Zhao, Yang Gao
[pdf]
[DOI]

PSS: Progressive Sample Selection for Open-World Visual Representation Learning
Tianyue Cao, Yongxin Wang, Yifan Xing, Tianjun Xiao, Tong He, Zheng Zhang, Hao Zhou, Joseph Tighe
[pdf]
[DOI]

Improving Self-Supervised Lightweight Model Learning via Hard-Aware Metric Distillation
Hao Liu, Mang Ye
[pdf]
[DOI]

Object Discovery via Contrastive Learning for Weakly Supervised Object Detection
Jinhwan Seo, Wonho Bae, Danica J. Sutherland, Junhyug Noh, Daijin Kim
[pdf]
[DOI]

Stochastic Consensus: Enhancing Semi-Supervised Learning with Consistency of Stochastic Classifiers
Hui Tang, Lin Sun, Kui Jia
[pdf]
[DOI]

DiffuseMorph: Unsupervised Deformable Image Registration Using Diffusion Model
Boah Kim, Inhwa Han, Jong Chul Ye
[pdf]
[DOI]

Semi-Leak: Membership Inference Attacks against Semi-Supervised Learning
Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang
[pdf]
[DOI]

OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning
Mamshad Nayeem Rizve, Navid Kardan, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
[pdf]
[DOI]

Embedding Contrastive Unsupervised Features to Cluster in- and Out-of-Distribution Noise in Corrupted Image Datasets
Paul Albert, Eric Arazo, Noel E. O’Connor, Kevin McGuinness
[pdf]
[DOI]

Unsupervised Few-Shot Image Classification by Learning Features into Clustering Space
Shuo Li, Fang Liu, Zehua Hao, Kaibo Zhao, Licheng Jiao
[pdf]
[DOI]

Towards Realistic Semi-Supervised Learning
Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah
[pdf]
[DOI]

Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas
[pdf]
[DOI]

Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization
Hannah M. Schlüter, Jeremy Tan, Benjamin Hou, Bernhard Kainz
[pdf]
[DOI]

Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li, Alexei A. Efros, Deepak Pathak
[pdf]
[DOI]

Federated Self-Supervised Learning for Video Understanding
Yasar Abbas Ur Rehman, Yan Gao, Jiajun Shen, Pedro Porto Buarque de Gusmão, Nicholas Lane
[pdf]
[DOI]

Towards Efficient and Effective Self-Supervised Learning of Visual Representations
Sravanti Addepalli, Kaushal Bhogale, Priyam Dey, R. Venkatesh Babu
[pdf]
[DOI]

DSR – A Dual Subspace Re-Projection Network for Surface Anomaly Detection
Vitjan Zavrtanik, Matej Kristan, Danijel Skočaj
[pdf]
[DOI]

PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds
Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov
[pdf]
[DOI]

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo
Xiaofeng Wang, Zheng Zhu, Guan Huang, Fangbo Qin, Yun Ye, Yijia He, Xu Chi, Xingang Wang
[pdf]
[DOI]

RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild
Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani
[pdf]
[DOI]

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov
[pdf]
[DOI]

KD-MVS: Knowledge Distillation Based Self-Supervised Learning for Multi-View Stereo
Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang
[pdf]
[DOI]

SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas
John Lambert, Yuguang Li, Ivaylo Boyadzhiev, Lambert Wixson, Manjunath Narayana, Will Hutchcroft, James Hays, Frank Dellaert, Sing Bing Kang
[pdf]
[DOI]

RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering
Di Chang, Aljaž Božič, Tong Zhang, Qingsong Yan, Yingcong Chen, Sabine Süsstrunk, Matthias Nießner
[pdf]
[DOI]

Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll
[pdf]
[DOI]

NeILF: Neural Incident Light Field for Physically-Based Material Estimation
Yao Yao, Jingyang Zhang, Jingbo Liu, Yihang Qu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
[pdf]
[DOI]

ARF: Artistic Radiance Fields
Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, Noah Snavely
[pdf]
[DOI]

Multiview Stereo with Cascaded Epipolar RAFT
Zeyu Ma, Zachary Teed, Jia Deng
[pdf]
[DOI]

ARAH: Animatable Volume Rendering of Articulated Human SDFs
Shaofei Wang, Katja Schwarz, Andreas Geiger, Siyu Tang
[pdf]
[DOI]

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer
Hongkai Chen, Zixin Luo, Lei Zhou, Yurun Tian, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
[pdf]
[DOI]

NDF: Neural Deformable Fields for Dynamic Human Modelling
Ruiqi Zhang, Jie Chen
[pdf]
[DOI]

Neural Density-Distance Fields
Itsuki Ueda, Yoshihiro Fukuhara, Hirokatsu Kataoka, Hiroaki Aizawa, Hidehiko Shishido, Itaru Kitahara
[pdf]
[DOI]

NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer
Yunxiao Wang, Yanjie Li, Peidong Liu, Tao Dai, Shu-Tao Xia
[pdf]
[DOI]

Learning Online Multi-sensor Depth Fusion
Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc Van Gool
[pdf]
[DOI]

BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-Scale Scene Rendering
Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, Dahua Lin
[pdf]
[DOI]

Decomposing the Tangent of Occluding Boundaries according to Curvatures and Torsions
Huizong Yang, Anthony Yezzi
[pdf]
[DOI]

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors
Jiepeng Wang, Peng Wang, Xiaoxiao Long, Christian Theobalt, Taku Komura, Lingjie Liu, Wenping Wang
[pdf]
[DOI]

Generalizable Patch-Based Neural Rendering
Mohammed Suhail, Carlos Esteves, Leonid Sigal, Ameesh Makadia
[pdf]
[DOI]

Improving RGB-D Point Cloud Registration by Learning Multi-Scale Local Linear Transformation
Ziming Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu
[pdf]
[DOI]

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images
Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen
[pdf]
[DOI]

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang
[pdf]
[DOI]

Disentangling Object Motion and Occlusion for Unsupervised Multi-Frame Monocular Depth
Ziyue Feng, Liang Yang, Longlong Jing, Haiyan Wang, YingLi Tian, Bing Li
[pdf]
[DOI]

Depth Field Networks for Generalizable Multi-View Scene Representation
Vitor Guizilini, Igor Vasiljevic, Jiading Fang, Rareș Ambruș, Greg Shakhnarovich, Matthew R. Walter, Adrien Gaidon
[pdf]
[DOI]

Context-Enhanced Stereo Transformer
Weiyu Guo, Zhaoshuo Li, Yongkui Yang, Zheng Wang, Russell H. Taylor, Mathias Unberath, Alan Yuille, Yingwei Li
[pdf]
[DOI]

PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching
Zhelun Shen, Yuchao Dai, Xibin Song, Zhibo Rao, Dingfu Zhou, Liangjun Zhang
[pdf]
[DOI]

Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images
Yuan Liu, Yilin Wen, Sida Peng, Cheng Lin, Xiaoxiao Long, Taku Komura, Wenping Wang
[pdf]
[DOI]

Latency-Aware Collaborative Perception
Zixing Lei, Shunli Ren, Yue Hu, Wenjun Zhang, Siheng Chen
[pdf]
[DOI]

TensoRF: Tensorial Radiance Fields
Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, Hao Su
[pdf]
[DOI]

NeFSAC: Neurally Filtered Minimal Samples
Luca Cavalli, Marc Pollefeys, Daniel Barath
[pdf]
[DOI]

SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data
Eldar Insafutdinov, Dylan Campbell, João F. Henriques, Andrea Vedaldi
[pdf]
[DOI]

HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields
Kim Jun-Seong, Kim Yu-Ji, Moon Ye-Bin, Tae-Hyun Oh
[pdf]
[DOI]

NeuMan: Neural Human Radiance Field from a Single Video
Wei Jiang, Kwang Moo Yi, Golnoosh Samei, Oncel Tuzel, Anurag Ranjan
[pdf]
[DOI]

TAVA: Template-Free Animatable Volumetric Actors
Ruilong Li, Julian Tanke, Minh Vo, Michael Zollhöfer, Jürgen Gall, Angjoo Kanazawa, Christoph Lassner
[pdf]
[DOI]

EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching
Qiang Wang, Shaohuai Shi, Kaiyong Zhao, Xiaowen Chu
[pdf]
[DOI]

Relative Pose from SIFT Features
Daniel Barath, Zuzana Kukelova
[pdf]
[DOI]

Selection and Cross Similarity for Event-Image Deep Stereo
Hoonhee Cho, Kuk-Jin Yoon
[pdf]
[DOI]

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
[pdf]
[DOI]

CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-Scale Indoor Scene
Hao-Xiang Chen, Jiahui Huang, Tai-Jiang Mu, Shi-Min Hu
[pdf]
[DOI]

ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild
Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang, Yong-Jin Liu
[pdf]
[DOI]

4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
Yujin Chen, Matthias Nießner, Angela Dai
[pdf]
[DOI]

Few ‘Zero Level Set’-Shot Learning of Shape Signed Distance Functions in Feature Space
Amine Ouasfi, Adnane Boukhayma
[pdf]
[DOI]

Solution Space Analysis of Essential Matrix Based on Algebraic Error Minimization
Gaku Nakano
[pdf]
[DOI]

Approximate Differentiable Rendering with Algebraic Surfaces
Leonid Keselman, Martial Hebert
[pdf]
[DOI]

CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360° Indoor Panoramas
Will Hutchcroft, Yuguang Li, Ivaylo Boyadzhiev, Zhiqiang Wan, Haiyan Wang, Sing Bing Kang
[pdf]
[DOI]

Affine Correspondences between Multi-Camera Systems for 6DOF Relative Pose Estimation
Banglei Guan, Ji Zhao
[pdf]
[DOI]

GraphFit: Learning Multi-Scale Graph-Convolutional Representation for Point Cloud Normal Estimation
Keqiang Li, Mingyang Zhao, Huaiyu Wu, Dong-Ming Yan, Zhen Shen, Fei-Yue Wang, Gang Xiong
[pdf]
[DOI]

IS-MVSNet: Importance Sampling-Based MVSNet
Likang Wang, Yue Gong, Xinjun Ma, Qirui Wang, Kaixuan Zhou, Lei Chen
[pdf]
[DOI]

Point Scene Understanding via Disentangled Instance Mesh Reconstruction
Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng
[pdf]
[DOI]

DiffuStereo: High Quality Human Reconstruction via Diffusion-Based Stereo Using Sparse Cameras
Ruizhi Shao, Zerong Zheng, Hongwen Zhang, Jingxiang Sun, Yebin Liu
[pdf]
[DOI]

Space-Partitioning RANSAC
Daniel Barath, Gábor Valasek
[pdf]
[DOI]

SimpleRecon: 3D Reconstruction without 3D Convolutions
Mohamed Sayed, John Gibson, Jamie Watson, Victor Prisacariu, Michael Firman, Clément Godard
[pdf]
[DOI]

Structure and Motion from Casual Videos
Zhoutong Zhang, Forrester Cole, Zhengqi Li, Noah Snavely, Michael Rubinstein, William T. Freeman
[pdf]
[DOI]

What Matters for 3D Scene Flow Network
Guangming Wang, Yunzhe Hu, Zhe Liu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
[pdf]
[DOI]

Correspondence Reweighted Translation Averaging
Lalit Manam, Venu Madhav Govindu
[pdf]
[DOI]

Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images
Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chenglei Wu, Sven Behnke, Giljoo Nam
[pdf]
[DOI]

GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs
Xin Liu, Xiaofei Shao, Bo Wang, Yali Li, Shengjin Wang
[pdf]
[DOI]

Objects Can Move: 3D Change Detection by Geometric Transformation Consistency
Aikaterini Adam, Torsten Sattler, Konstantinos Karantzalos, Tomas Pajdla
[pdf]
[DOI]

Language-Grounded Indoor 3D Semantic Segmentation in the Wild
Dávid Rozenberszki, Or Litany, Angela Dai
[pdf]
[DOI]

Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs
Sameera Ramasinghe, Simon Lucey
[pdf]
[DOI]

Deforming Radiance Fields with Cages
Tianhan Xu, Tatsuya Harada
[pdf]
[DOI]

FLEX: Extrinsic Parameters-Free Multi-View 3D Human Motion Reconstruction
Brian Gordon, Sigal Raab, Guy Azov, Raja Giryes, Daniel Cohen-Or
[pdf]
[DOI]

MODE: Multi-View Omnidirectional Depth Estimation with 360° Cameras
Ming Li, Xueqian Jin, Xuejiao Hu, Jingzhao Dai, Sidan Du, Yang Li
[pdf]
[DOI]

GigaDepth: Learning Depth from Structured Light with Branching Neural Networks
Simon Schreiberhuber, Jean-Baptiste Weibel, Timothy Patten, Markus Vincze
[pdf]
[DOI]

ActiveNeRF: Learning Where to See with Uncertainty Estimation
Xuran Pan, Zihang Lai, Shiji Song, Gao Huang
[pdf]
[DOI]

PoserNet: Refining Relative Camera Poses Exploiting Object Detections
Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue
[pdf]
[DOI]

Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation
Shin-Fang Chng, Sameera Ramasinghe, Jamie Sherrah, Simon Lucey
[pdf]
[DOI]

Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling
Jan U. Müller, Michael Weinmann, Reinhard Klein
[pdf]
[DOI]

Towards Learning Neural Representations from Shadows
Kushagra Tiwary, Tzofi Klinghoffer, Ramesh Raskar
[pdf]
[DOI]

Class-Incremental Novel Class Discovery
Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci
[pdf]
[DOI]

Unknown-Oriented Learning for Open Set Domain Adaptation
Jie Liu, Xiaoqing Guo, Yixuan Yuan
[pdf]
[DOI]

Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation
Hongbin Lin, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Chuang Gan, Yanxia Liu, Mingkui Tan
[pdf]
[DOI]

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation
Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
[pdf]
[DOI]

Class-Agnostic Object Counting Robust to Intraclass Diversity
Shenjian Gong, Shanshan Zhang, Jian Yang, Dengxin Dai, Bernt Schiele
[pdf]
[DOI]

Burn after Reading: Online Adaptation for Cross-Domain Streaming Data
Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Abhinav Shrivastava, Chetan Ramaiah
[pdf]
[DOI]

Mind the Gap in Distilling StyleGANs
Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy
[pdf]
[DOI]

Improving Test-Time Adaptation via Shift-Agnostic Weight Regularization and Nearest Source Prototypes
Sungha Choi, Seunghan Yang, Seokeon Choi, Sungrack Yun
[pdf]
[DOI]

Learning Instance-Specific Adaptation for Cross-Domain Segmentation
Yuliang Zou, Zizhao Zhang, Chun-Liang Li, Han Zhang, Tomas Pfister, Jia-Bin Huang
[pdf]
[DOI]

RegionCL: Exploring Contrastive Region Pairs for Self-Supervised Representation Learning
Yufei Xu, Qiming Zhang, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

Long-Tailed Class Incremental Learning
Xialei Liu, Yu-Song Hu, Xu-Sheng Cao, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng
[pdf]
[DOI]

DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning
Hyounguk Shon, Janghyeon Lee, Seung Hwan Kim, Junmo Kim
[pdf]
[DOI]

Adversarial Partial Domain Adaptation by Cycle Inconsistency
Kun-Yu Lin, Jiaming Zhou, Yukun Qiu, Wei-Shi Zheng
[pdf]
[DOI]

Combating Label Distribution Shift for Active Domain Adaptation
Sehyun Hwang, Sohyun Lee, Sungyeon Kim, Jungseul Ok, Suha Kwak
[pdf]
[DOI]

GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation
Cristiano Saltori, Evgeny Krivosheev, Stéphane Lathuilière, Nicu Sebe, Fabio Galasso, Giuseppe Fiameni, Elisa Ricci, Fabio Poiesi
[pdf]
[DOI]

CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation
Cristiano Saltori, Fabio Galasso, Giuseppe Fiameni, Nicu Sebe, Elisa Ricci, Fabio Poiesi
[pdf]
[DOI]

A Unified Framework for Domain Adaptive Pose Estimation
Donghyun Kim, Kaihong Wang, Kate Saenko, Margrit Betke, Stan Sclaroff
[pdf]
[DOI]

A Broad Study of Pre-training for Domain Generalization and Adaptation
Donghyun Kim, Kaihong Wang, Stan Sclaroff, Kate Saenko
[pdf]
[DOI]

Prior Knowledge Guided Unsupervised Domain Adaptation
Tao Sun, Cheng Lu, Haibin Ling
[pdf]
[DOI]

GCISG: Guided Causal Invariant Learning for Improved Syn-to-Real Generalization
Gilhyun Nam, Gyeongjae Choi, Kyungmin Lee
[pdf]
[DOI]

AcroFOD: An Adaptive Method for Cross-Domain Few-Shot Object Detection
Yipeng Gao, Lingxiao Yang, Yunmu Huang, Song Xie, Shiyong Li, Wei-Shi Zheng
[pdf]
[DOI]

Unsupervised Domain Adaptation for One-Stage Object Detector Using Offsets to Bounding Box
Jayeon Yoo, Inseop Chung, Nojun Kwak
[pdf]
[DOI]

Visual Prompt Tuning
Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim
[pdf]
[DOI]

Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap
Yongwei Chen, Zihao Wang, Longkun Zou, Ke Chen, Kui Jia
[pdf]
[DOI]

Interpretable Open-Set Domain Adaptation via Angular Margin Separation
Xinhao Li, Jingjing Li, Zhekai Du, Lei Zhu, Wen Li
[pdf]
[DOI]

TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation
Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc Van Gool
[pdf]
[DOI]

Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation
Zhengkai Jiang, Yuxi Li, Ceyuan Yang, Peng Gao, Yabiao Wang, Ying Tai, Chengjie Wang
[pdf]
[DOI]

RBC: Rectifying the Biased Context in Continual Semantic Segmentation
Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li
[pdf]
[DOI]

Factorizing Knowledge in Neural Networks
Xingyi Yang, Jingwen Ye, Xinchao Wang
[pdf]
[DOI]

Contrastive Vicinal Space for Unsupervised Domain Adaptation
Jaemin Na, Dongyoon Han, Hyung Jin Chang, Wonjun Hwang
[pdf]
[DOI]

Cross-Modal Knowledge Transfer without Task-Relevant Source Data
Sk Miraj Ahmed, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Amit K. Roy-Chowdhury
[pdf]
[DOI]

Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions
Theodoros Panagiotakopoulos, Pier Luigi Dovesi, Linus Härenstam-Nielsen, Matteo Poggi
[pdf]
[DOI]

Source-Free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition
Yuecong Xu, Jianfei Yang, Haozhi Cao, Keyu Wu, Min Wu, Zhenghua Chen
[pdf]
[DOI]

BMD: A General Class-Balanced Multicentric Dynamic Prototype Strategy for Source-Free Domain Adaptation
Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, Dacheng Tao
[pdf]
[DOI]

Generalized Brain Image Synthesis with Transferable Convolutional Sparse Coding Networks
Yawen Huang, Feng Zheng, Xu Sun, Yuexiang Li, Ling Shao, Yefeng Zheng
[pdf]
[DOI]

Incomplete Multi-View Domain Adaptation via Channel Enhancement and Knowledge Transfer
Haifeng Xia, Pu Wang, Zhengming Ding
[pdf]
[DOI]

DistPro: Searching a Fast Knowledge Distillation Process via Meta Optimization
Xueqing Deng, Dawei Sun, Shawn Newsam, Peng Wang
[pdf]
[DOI]

ML-BPM: Multi-Teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation
Fei Pan, Sungsu Hur, Seokju Lee, Junsik Kim, In So Kweon
[pdf]
[DOI]

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks
Nan Ding, Xi Chen, Tomer Levinboim, Soravit Changpinyo, Radu Soricut
[pdf]
[DOI]

Personalized Education: Blind Knowledge Distillation
Xiang Deng, Jian Zheng, Zhongfei Zhang
[pdf]
[DOI]

Not All Models Are Equal: Predicting Model Transferability in a Self-Challenging Fisher Space
Wenqi Shao, Xun Zhao, Yixiao Ge, Zhaoyang Zhang, Lei Yang, Xiaogang Wang, Ying Shan, Ping Luo
[pdf]
[DOI]

How Stable Are Transferability Metrics Evaluations?
Andrea Agostinelli, Michal Pándy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
[pdf]
[DOI]

Attention Diversification for Domain Generalization
Rang Meng, Xianfeng Li, Weijie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, Shiliang Pu
[pdf]
[DOI]

ESS: Learning Event-Based Semantic Segmentation from Still Images
Zhaoning Sun, Nico Messikommer, Daniel Gehrig, Davide Scaramuzza
[pdf]
[DOI]

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang
[pdf]
[DOI]

Human Trajectory Prediction via Neural Social Physics
Jiangbei Yue, Dinesh Manocha, He Wang
[pdf]
[DOI]

Towards Open Set Video Anomaly Detection
Yuansheng Zhu, Wentao Bao, Qi Yu
[pdf]
[DOI]

ECLIPSE: Efficient Long-Range Video Retrieval Using Sight and Sound
Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius
[pdf]
[DOI]

Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, Limin Wang
[pdf]
[DOI]

Less than Few: Self-Shot Video Instance Segmentation
Pengwan Yang, Yuki M. Asano, Pascal Mettes, Cees G. M. Snoek
[pdf]
[DOI]

Adaptive Face Forgery Detection in Cross Domain
Luchuan Song, Zheng Fang, Xiaodan Li, Xiaoyi Dong, Zhenchao Jin, Yuefeng Chen, Siwei Lyu
[pdf]
[DOI]

Real-Time Online Video Detection with Temporal Smoothing Transformers
Yue Zhao, Philipp Krähenbühl
[pdf]
[DOI]

TALLFormer: Temporal Action Localization with a Long-Memory Transformer
Feng Cheng, Gedas Bertasius
[pdf]
[DOI]

Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation
Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc Van Gool
[pdf]
[DOI]

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency
Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid
[pdf]
[DOI]

Rethinking Learning Approaches for Long-Term Action Anticipation
Megha Nawhal, Akash Abdu Jyothi, Greg Mori
[pdf]
[DOI]

DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan
[pdf]
[DOI]

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation
Gensheng Pei, Fumin Shen, Yazhou Yao, Guo-Sen Xie, Zhenmin Tang, Jinhui Tang
[pdf]
[DOI]

PAC-Net: Highlight Your Video via History Preference Modeling
Hang Wang, Penghao Zhou, Chong Zhou, Zhao Zhang, Xing Sun
[pdf]
[DOI]

How Severe Is Benchmark-Sensitivity in Video Self-Supervised Learning?
Fida Mohammad Thoker, Hazel Doughty, Piyush Bagad, Cees G. M. Snoek
[pdf]
[DOI]

A Sliding Window Scheme for Online Temporal Action Localization
Young Hwi Kim, Hyolim Kang, Seon Joo Kim
[pdf]
[DOI]

ERA: Expert Retrieval and Assembly for Early Action Prediction
Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu
[pdf]
[DOI]

Dual Perspective Network for Audio-Visual Event Localization
Varshanth Rao, Md Ibrahim Khalil, Haoda Li, Peng Dai, Juwei Lu
[pdf]
[DOI]

NSNet: Non-Saliency Suppression Sampler for Efficient Video Recognition
Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
[pdf]
[DOI]

Video Activity Localisation with Uncertainties in Temporal Boundary
Jiabo Huang, Hailin Jin, Shaogang Gong, Yang Liu
[pdf]
[DOI]

Temporal Saliency Query Network for Efficient Video Recognition
Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han
[pdf]
[DOI]

Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency
Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson
[pdf]
[DOI]

Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation
Guodong Ding, Angela Yao
[pdf]
[DOI]

Spotting Temporally Precise, Fine-Grained Events in Video
James Hong, Haotian Zhang, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian
[pdf]
[DOI]

Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Jürgen Gall, Mehdi Noroozi
[pdf]
[DOI]

Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang
[pdf]
[DOI]

Long Movie Clip Classification with State-Space Video Models
Md Mohaiminul Islam, Gedas Bertasius
[pdf]
[DOI]

Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
[pdf]
[DOI]

Asymmetric Relation Consistency Reasoning for Video Relation Grounding
Huan Li, Ping Wei, Jiapeng Li, Zeyu Ma, Jiahui Shang, Nanning Zheng
[pdf]
[DOI]

Self-Supervised Social Relation Representation for Human Group Detection
Jiacheng Li, Ruize Han, Haomin Yan, Zekun Qian, Wei Feng, Song Wang
[pdf]
[DOI]

K-Centered Patch Sampling for Efficient Video Recognition
Seong Hyeon Park, Jihoon Tack, Byeongho Heo, Jung-Woo Ha, Jinwoo Shin
[pdf]
[DOI]

A Deep Moving-Camera Background Model
Guy Erez, Ron Shapira Weber, Oren Freifeld
[pdf]
[DOI]

GraphVid: It Only Takes a Few Nodes to Understand a Video
Eitan Kosman, Dotan Di Castro
[pdf]
[DOI]

Delta Distillation for Efficient Video Processing
Amirhossein Habibian, Haitam Ben Yahia, Davide Abati, Efstratios Gavves, Fatih Porikli
[pdf]
[DOI]

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
[pdf]
[DOI]

COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
[pdf]
[DOI]

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context
Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong Liu
[pdf]
[DOI]

TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson
[pdf]
[DOI]

Semi-Supervised Learning of Optical Flow by Flow Supervisor
Woobin Im, Sebin Lee, Sung-Eui Yoon
[pdf]
[DOI]

Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization
Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson
[pdf]
[DOI]

Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion
Yiheng Li, Connelly Barnes, Kun Huang, Fang-Lue Zhang
[pdf]
[DOI]

MaCLR: Motion-Aware Contrastive Learning of Representations for Videos
Fanyi Xiao, Joseph Tighe, Davide Modolo
[pdf]
[DOI]

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min, Sourya Roy, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar
[pdf]
[DOI]

Frozen CLIP Models Are Efficient Video Learners
Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li
[pdf]
[DOI]

PIP: Physical Interaction Prediction via Mental Simulation with Span Selection
Jiafei Duan, Samson Yu, Soujanya Poria, Bihan Wen, Cheston Tan
[pdf]
[DOI]

Panoramic Vision Transformer for Saliency Detection in 360° Videos
Heeseung Yun, Sehun Lee, Gunhee Kim
[pdf]
[DOI]

Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration
Aditi Basu Bal, Ramy Mounir, Sathyanarayanan Aakur, Sudeep Sarkar, Anuj Srivastava
[pdf]
[DOI]

Motion Sensitive Contrastive Learning for Self-Supervised Video Representation
Jingcheng Ni, Nan Zhou, Jie Qin, Qian Wu, Junqi Liu, Boxun Li, Di Huang
[pdf]
[DOI]

Dynamic Temporal Filtering In Video Models
Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei
[pdf]
[DOI]

Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification
Renrui Zhang, Wei Zhang, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
[pdf]
[DOI]

Temporal Lift Pooling for Continuous Sign Language Recognition
Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng
[pdf]
[DOI]

MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
[pdf]
[DOI]

SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding
Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei
[pdf]
[DOI]

Cross-Modal Prototype Driven Network for Radiology Report Generation
Jun Wang, Abhir Bhalerao, Yulan He
[pdf]
[DOI]

TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chuan Guo, Xinxin Zuo, Sen Wang, Li Cheng
[pdf]
[DOI]

SeqTR: A Simple Yet Universal Network for Visual Grounding
Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji
[pdf]
[DOI]

VTC: Improving Video-Text Retrieval with User Comments
Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht
[pdf]
[DOI]

FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
[pdf]
[DOI]

Weakly Supervised Grounding for VQA in Vision-Language Transformers
Aisha Urooj, Hilde Kuehne, Chuang Gan, Niels Da Vitoria Lobo, Mubarak Shah
[pdf]
[DOI]

Automatic Dense Annotation of Large-Vocabulary Sign Language Videos
Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman
[pdf]
[DOI]

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval
Yuying Ge, Yixiao Ge, Xihui Liu, Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo
[pdf]
[DOI]

GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou
[pdf]
[DOI]

A Simple and Robust Correlation Filtering Method for Text-Based Person Search
Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu
[pdf]
[DOI]

Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, Hoifung Poon, Ozan Oktay
[pdf]
[DOI]

Generative Negative Text Replay for Continual Vision-Language Pretraining
Shipeng Yan, Lanqing Hong, Hang Xu, Jianhua Han, Tinne Tuytelaars, Zhenguo Li, Xuming He
[pdf]
[DOI]

Video Graph Transformer for Video Question Answering
Junbin Xiao, Pan Zhou, Tat-Seng Chua, Shuicheng Yan
[pdf]
[DOI]

Trace Controlled Text to Image Generation
Kun Yan, Lei Ji, Chenfei Wu, Jianmin Bao, Ming Zhou, Nan Duan, Shuai Ma
[pdf]
[DOI]

Video Question Answering with Iterative Video-Text Co-Tokenization
AJ Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova
[pdf]
[DOI]

Rethinking Data Augmentation for Robust Visual Question Answering
Long Chen, Yuhang Zheng, Jun Xiao
[pdf]
[DOI]

Explicit Image Caption Editing
Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao
[pdf]
[DOI]

Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
Jiachang Hao, Haifeng Sun, Pengfei Ren, Jingyu Wang, Qi Qi, Jianxin Liao
[pdf]
[DOI]

Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly
Spencer Whitehead, Suzanne Petryk, Vedaad Shakib, Joseph Gonzalez, Trevor Darrell, Anna Rohrbach, Marcus Rohrbach
[pdf]
[DOI]

GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features
Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
[pdf]
[DOI]

Selective Query-Guided Debiasing for Video Corpus Moment Retrieval
Sunjae Yoon, Ji Woo Hong, Eunseop Yoon, Dahyun Kim, Junyeong Kim, Hee Suk Yoon, Chang D. Yoo
[pdf]
[DOI]

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
Cheng Shi, Sibei Yang
[pdf]
[DOI]

Object-Centric Unsupervised Image Captioning
Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim
[pdf]
[DOI]

Contrastive Vision-Language Pre-training with Limited Resources
Quan Cui, Boyan Zhou, Yu Guo, Weidong Yin, Hao Wu, Osamu Yoshie, Yubo Chen
[pdf]
[DOI]

Learning Linguistic Association towards Efficient Text-Video Retrieval
Sheng Fang, Shuhui Wang, Junbao Zhuo, Xinzhe Han, Qingming Huang
[pdf]
[DOI]

ASSISTER: Assistive Navigation via Conditional Instruction Generation
Zanming Huang, Zhongkai Shangguan, Jimuyang Zhang, Gilad Bar, Matthew Boyd, Eshed Ohn-Bar
[pdf]
[DOI]

X-DETR: A Versatile Architecture for Instance-Wise Vision-Language Tasks
Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto
[pdf]
[DOI]

Learning Disentanglement with Decoupled Labels for Vision-Language Navigation
Wenhao Cheng, Xingping Dong, Salman Khan, Jianbing Shen
[pdf]
[DOI]

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input
Qingpei Guo, Kaisheng Yao, Wei Chu
[pdf]
[DOI]

Word-Level Fine-Grained Story Visualization
Bowen Li
[pdf]
[DOI]

Unifying Event Detection and Captioning as Sequence Generation via Pre-training
Qi Zhang, Yuqing Song, Qin Jin
[pdf]
[DOI]

Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation
Chuang Lin, Yi Jiang, Jianfei Cai, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan
[pdf]
[DOI]

Fine-Grained Visual Entailment
Christopher Thomas, Yipeng Zhang, Shih-Fu Chang
[pdf]
[DOI]

Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki
[pdf]
[DOI]

New Datasets and Models for Contextual Reasoning in Visual Dialog
Yifeng Zhang, Ming Jiang, Qi Zhao
[pdf]
[DOI]

VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
Joanna Hong, Minsu Kim, Yong Man Ro
[pdf]
[DOI]

Classification-Regression for Chart Comprehension
Matan Levy, Rami Ben-Ari, Dani Lischinski
[pdf]
[DOI]

AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou
[pdf]
[DOI]

FindIt: Generalized Localization with Natural Language Queries
Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova
[pdf]
[DOI]

UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang, Zhe Gan, Jianfeng Wang, Xiaowei Hu, Faisal Ahmed, Zicheng Liu, Yumao Lu, Lijuan Wang
[pdf]
[DOI]

Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin
[pdf]
[DOI]

The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
Jack Hessel, Jena D. Hwang, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko, Yejin Choi
[pdf]
[DOI]

Speaker-Adaptive Lip Reading with User-Dependent Padding
Minsu Kim, Hyunjun Kim, Yong Man Ro
[pdf]
[DOI]

TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation
Tan M. Dinh, Rang Nguyen, Binh-Son Hua
[pdf]
[DOI]

SemAug: Semantically Meaningful Image Augmentations for Object Detection through Language Grounding
Morgan Heisler, Amin Banitalebi-Dehkordi, Yong Zhang
[pdf]
[DOI]

Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance
Myungsub Choi
[pdf]
[DOI]

NewsStories: Illustrating Articles with Visual Summaries
Reuben Tan, Bryan A. Plummer, Kate Saenko, JP Lewis, Avneesh Sud, Thomas Leung
[pdf]
[DOI]

Webly Supervised Concept Expansion for General Purpose Vision Models
Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi
[pdf]
[DOI]

FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation
Kaiwen Zhou, Xin Eric Wang
[pdf]
[DOI]

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang
[pdf]
[DOI]

Language-Driven Artistic Style Transfer
Tsu-Jui Fu, Xin Eric Wang, William Yang Wang
[pdf]
[DOI]

Single-Stream Multi-level Alignment for Vision-Language Pretraining
Zaid Khan, Vijay Kumar B G, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu
[pdf]
[DOI]

Most and Least Retrievable Images in Visual-Language Query Systems
Liuwan Zhu, Rui Ning, Jiang Li, Chunsheng Xin, Hongyi Wu
[pdf]
[DOI]

Sports Video Analysis on Large-Scale Data
Dekun Wu, He Zhao, Xingce Bao, Richard P. Wildes
[pdf]
[DOI]

Grounding Visual Representations with Texts for Domain Generalization
Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim
[pdf]
[DOI]

Bridging the Visual Semantic Gap in VLN via Semantically Richer Instructions
Joaquín Ossandón, Benjamín Earle, Alvaro Soto
[pdf]
[DOI]

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
Adyasha Maharana, Darryl Hannan, Mohit Bansal
[pdf]
[DOI]

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff
[pdf]
[DOI]

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou
[pdf]
[DOI]

End-to-End Active Speaker Detection
Juan León Alcázar, Moritz Cordes, Chen Zhao, Bernard Ghanem
[pdf]
[DOI]

Emotion Recognition for Multiple Context Awareness
Dingkang Yang, Shuai Huang, Shunli Wang, Yang Liu, Peng Zhai, Liuzhen Su, Mingcheng Li, Lihua Zhang
[pdf]
[DOI]

Adaptive Fine-Grained Sketch-Based Image Retrieval
Ayan Kumar Bhunia, Aneeshan Sain, Parth Hiren Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
[pdf]
[DOI]

Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov
[pdf]
[DOI]

Uncertainty-Aware Multi-modal Learning via Cross-Modal Random Network Prediction
Hu Wang, Jianpeng Zhang, Yuanhong Chen, Congbo Ma, Jodie Avery, Louise Hull, Gustavo Carneiro
[pdf]
[DOI]

Localizing Visual Sounds the Easy Way
Shentong Mo, Pedro Morgado
[pdf]
[DOI]

Learning Visual Styles from Audio-Visual Associations
Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao
[pdf]
[DOI]

Remote Respiration Monitoring of Moving Person Using Radio Signals
Jae-Ho Choi, Ki-Bong Kang, Kyung-Tae Kim
[pdf]
[DOI]

Camera Pose Estimation and Localization with Active Audio Sensing
Karren Yang, Michael Firman, Eric Brachmann, Clément Godard
[pdf]
[DOI]

PACS: A Dataset for Physical Audiovisual Commonsense Reasoning
Samuel Yu, Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
[pdf]
[DOI]

VoViT: Low Latency Graph-Based Audio-Visual Voice Separation Transformer
Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro
[pdf]
[DOI]

Telepresence Video Quality Assessment
Zhenqiang Ying, Deepti Ghadiyaram, Alan Bovik
[pdf]
[DOI]

MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann, David Mizrahi, Andrei Atanov, Amir Zamir
[pdf]
[DOI]

AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
[pdf]
[DOI]

Audio-Visual Segmentation
Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
[pdf]
[DOI]

Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression
Yeying Jin, Wenhan Yang, Robby T. Tan
[pdf]
[DOI]

Relationformer: A Unified Framework for Image-to-Graph Generation
Suprosanna Shit, Rajat Koner, Bastian Wittmann, Johannes Paetzold, Ivan Ezhov, Hongwei Li, Jiazhen Pan, Sahand Sharifzadeh, Georgios Kaissis, Volker Tresp, Bjoern Menze
[pdf]
[DOI]

GAMa: Cross-view Video Geo-localization
Shruti Vyas, Chen Chen, Mubarak Shah
[pdf]
[DOI]

Revisiting a kNN-based Image Classification System with High-capacity Storage
Kengo Nakata, Youyang Ng, Daisuke Miyashita, Asuka Maki, Yu-Chieh Lin, Jun Deguchi
[pdf]
[DOI]

Geometric Representation Learning for Document Image Rectification
Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, Houqiang Li
[pdf]
[DOI]

S2-VER: Semi-Supervised Visual Emotion Recognition
Guoli Jia, Jufeng Yang
[pdf]
[DOI]

Image Coding for Machines with Omnipotent Feature Learning
Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen
[pdf]
[DOI]

Feature Representation Learning for Unsupervised Cross-Domain Image Retrieval
Conghui Hu, Gim Hee Lee
[pdf]
[DOI]

Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
Shilin Xu, Xiangtai Li, Jingbo Wang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
[pdf]
[DOI]

Semantic-Guided Multi-Mask Image Harmonization
Xuqian Ren, Yifan Liu
[pdf]
[DOI]

Learning an Isometric Surface Parameterization for Texture Unwrapping
Sagnik Das, Ke Ma, Zhixin Shu, Dimitris Samaras
[pdf]
[DOI]

Towards Regression-Free Neural Networks for Diverse Compute Platforms
Rahul Duggal, Hao Zhou, Shuo Yang, Jun Fang, Yuanjun Xiong, Wei Xia
[pdf]
[DOI]

Relationship Spatialization for Depth Estimation
Xiaoyu Xu, Jiayan Qiu, Xinchao Wang, Zhou Wang
[pdf]
[DOI]

Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu, Shijia Yang, Tomer Galanti, Bichen Wu, Xiangyu Yue, Bohan Zhai, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
[pdf]
[DOI]

FAR: Fourier Aerial Video Recognition
Divya Kothandaraman, Tianrui Guan, Xijun Wang, Shuowen Hu, Ming Lin, Dinesh Manocha
[pdf]
[DOI]

Translating a Visual LEGO Manual to a Machine-Executable Plan
Ruocheng Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu
[pdf]
[DOI]

Fabric Material Recovery from Video Using Multi-Scale Geometric Auto-Encoder
Junbang Liang, Ming Lin
[pdf]
[DOI]

MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
Jie Ren, Wenteng Liang, Ran Yan, Luo Mai, Shiwen Liu, Xiao Liu
[pdf]
[DOI]

The One Where They Reconstructed 3D Humans and Environments in TV Shows
Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa
[pdf]
[DOI]

TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices Using Submodular Mutual Information
Suraj Kothawade, Saikat Ghosh, Sumit Shekhar, Yu Xiang, Rishabh Iyer
[pdf]
[DOI]

An Efficient Person Clustering Algorithm for Open Checkout-Free Groceries
Junde Wu, Yu Zhang, Rao Fu, Yuanpei Liu, Jing Gao
[pdf]
[DOI]

POP: Mining POtential Performance of New Fashion Products via Webly Cross-Modal Query Expansion
Christian Joppi, Geri Skenderi, Marco Cristani
[pdf]
[DOI]

Pose Forecasting in Industrial Human-Robot Collaboration
Alessio Sampieri, Guido Maria D’Amely di Melendugno, Andrea Avogaro, Federico Cunico, Francesco Setti, Geri Skenderi, Marco Cristani, Fabio Galasso
[pdf]
[DOI]

Actor-Centered Representations for Action Localization in Streaming Videos
Sathyanarayanan Aakur, Sudeep Sarkar
[pdf]
[DOI]

Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT
Xiufeng Xie, Ning Zhou, Wentao Zhu, Ji Liu
[pdf]
[DOI]

Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment
Paritosh Parmar, Amol Gharat, Helge Rhodin
[pdf]
[DOI]

Responsive Listening Head Generation: A Benchmark Dataset and Baseline
Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei
[pdf]
[DOI]

Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics
Sen Zhang, Jing Zhang, Dacheng Tao
[pdf]
[DOI]

TIPS: Text-Induced Pose Synthesis
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
[pdf]
[DOI]

Addressing Heterogeneity in Federated Learning via Distributional Transformation
Haolin Yuan, Bo Hui, Yuchen Yang, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao
[pdf]
[DOI]

Where in the World Is This Image? Transformer-Based Geo-Localization in the Wild
Shraman Pramanick, Ewa M. Nowara, Joshua Gleason, Carlos D. Castillo, Rama Chellappa
[pdf]
[DOI]

Colorization for In Situ Marine Plankton Images
Guannan Guo, Qi Lin, Tao Chen, Zhenghui Feng, Zheng Wang, Jianping Li
[pdf]
[DOI]

Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection
Mingyu Yang, Yu Chen, Hun-Seok Kim
[pdf]
[DOI]

A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch
Patsorn Sangkloy, Wittawat Jitkrittum, Diyi Yang, James Hays
[pdf]
[DOI]

A Cloud 3D Dataset and Application-Specific Learned Image Compression in Cloud 3D
Tianyi Liu, Sen He, Vinodh Kumaran Jayakumar, Wei Wang
[pdf]
[DOI]

AutoTransition: Learning to Recommend Video Transition Effects
Yaojie Shen, Libo Zhang, Kai Xu, Xiaojie Jin
[pdf]
[DOI]

Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Romain Loiseau, Mathieu Aubry, Loïc Landrieu
[pdf]
[DOI]

Open-World Semantic Segmentation for LIDAR Point Clouds
Jun Cen, Peng Yun, Shiwei Zhang, Junhao Cai, Di Luan, Mingqian Tang, Ming Liu, Michael Yu Wang
[pdf]
[DOI]

KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
Niklas Hanselmann, Katrin Renz, Kashyap Chitta, Apratim Bhattacharyya, Andreas Geiger
[pdf]
[DOI]

Differentiable Raycasting for Self-Supervised Occupancy Forecasting
Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan
[pdf]
[DOI]

InAction: Interpretable Action Decision Making for Autonomous Driving
Taotao Jing, Haifeng Xia, Renran Tian, Haoran Ding, Xiao Luo, Joshua Domeyer, Rini Sherony, Zhengming Ding
[pdf]
[DOI]

CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection
Jyh-Jing Hwang, Henrik Kretzschmar, Joshua Manela, Sean Rafferty, Nicholas Armstrong-Crews, Tiffany Chen, Dragomir Anguelov
[pdf]
[DOI]

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
Kaican Li, Kai Chen, Haoyu Wang, Lanqing Hong, Chaoqiang Ye, Jianhua Han, Yukuai Chen, Wei Zhang, Chunjing Xu, Dit-Yan Yeung, Xiaodan Liang, Zhenguo Li, Hang Xu
[pdf]
[DOI]

Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving
Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov
[pdf]
[DOI]

StretchBEV: Stretching Future Instance Prediction Spatially and Temporally
Adil Kaan Akan, Fatma Güney
[pdf]
[DOI]

RCLane: Relay Chain Prediction for Lane Detection
Shenghua Xu, Xinyue Cai, Bin Zhao, Li Zhang, Hang Xu, Yanwei Fu, Xiangyang Xue
[pdf]
[DOI]

Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-Modal Distillation
Antonin Vobecky, David Hurych, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic
[pdf]
[DOI]

CenterFormer: Center-based Transformer for 3D Object Detection
Zixiang Zhou, Xiangchen Zhao, Yu Wang, Panqu Wang, Hassan Foroosh
[pdf]
[DOI]

Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches
Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang
[pdf]
[DOI]

ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning
Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, Junchi Yan, Dacheng Tao
[pdf]
[DOI]

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark
Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, Junchi Yan
[pdf]
[DOI]

PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation
Kwonyoung Kim, Jungin Park, Jiyoung Lee, Dongbo Min, Kwanghoon Sohn
[pdf]
[DOI]

BRNet: Exploring Comprehensive Features for Monocular Depth Estimation
Wencheng Han, Junbo Yin, Xiaogang Jin, Xiangdong Dai, Jianbing Shen
[pdf]
[DOI]

SiamDoGe: Domain Generalizable Semantic Segmentation Using Siamese Network
Zhenyao Wu, Xinyi Wu, Xiaoping Zhang, Lili Ju, Song Wang
[pdf]
[DOI]

Context-Aware Streaming Perception in Dynamic Environments
Gur-Eyal Sela, Ionel Gog, Justin Wong, Kumar Krishna Agrawal, Xiangxi Mo, Sukrit Kalra, Peter Schafhalter, Eric Leong, Xin Wang, Bharathan Balaji, Joseph Gonzalez, Ion Stoica
[pdf]
[DOI]

SpOT: Spatiotemporal Modeling for 3D Object Tracking
Colton Stearns, Davis Rempe, Jie Li, Rareș Ambruș, Sergey Zakharov, Vitor Guizilini, Yanchao Yang, Leonidas J. Guibas
[pdf]
[DOI]

Multimodal Transformer for Automatic 3D Annotation and Object Detection
Chang Liu, Xiaoyan Qian, Binxiao Huang, Xiaojuan Qi, Edmund Lam, Siew-Chong Tan, Ngai Wong
[pdf]
[DOI]

Dynamic 3D Scene Analysis by Point Cloud Accumulation
Shengyu Huang, Zan Gojcic, Jiahui Huang, Andreas Wieser, Konrad Schindler
[pdf]
[DOI]

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
Xin Li, Botian Shi, Yuenan Hou, Xingjiao Wu, Tianlong Ma, Yikang Li, Liang He
[pdf]
[DOI]

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes
Haimei Zhao, Jing Zhang, Sen Zhang, Dacheng Tao
[pdf]
[DOI]

Semi-Supervised 3D Object Detection with Proficient Teachers
Junbo Yin, Jin Fang, Dingfu Zhou, Liangjun Zhang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
[pdf]
[DOI]

Point Cloud Compression with Sibling Context and Surface Priors
Zhili Chen, Zian Qian, Sukai Wang, Qifeng Chen
[pdf]
[DOI]

Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module
Han Zhang, Yunchao Gu, Xinliang Wang, Junjun Pan, Minghui Wang
[pdf]
[DOI]

ProposalContrast: Unsupervised Pre-training for LiDAR-Based 3D Object Detection
Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
[pdf]
[DOI]

PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map
Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan
[pdf]
[DOI]

Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions
Nikhil Reddy, Abhinav Singhal, Abhishek Kumar, Mahsa Baktashmotlagh, Chetan Arora
[pdf]
[DOI]

LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds
Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov
[pdf]
[DOI]

Visual Cross-View Metric Localization with Dense Uncertainty Estimates
Zimin Xia, Olaf Booij, Marco Manfredi, Julian F. P. Kooij
[pdf]
[DOI]

V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma
[pdf]
[DOI]

DevNet: Self-Supervised Monocular Depth Learning via Density Volume Construction
Kaichen Zhou, Lanqing Hong, Changhao Chen, Hang Xu, Chaoqiang Ye, Qingyong Hu, Zhenguo Li
[pdf]
[DOI]

Action-Based Contrastive Learning for Trajectory Prediction
Marah Halawa, Olaf Hellwich, Pia Bideau
[pdf]
[DOI]

Radatron: Accurate Detection Using Multi-Resolution Cascaded MIMO Radar
Sohrab Madani, Jayden Guan, Waleed Ahmed, Saurabh Gupta, Haitham Hassanieh
[pdf]
[DOI]

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection
Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jie Zhou, Jiwen Lu
[pdf]
[DOI]

Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks
Maosheng Ye, Rui Wan, Shuangjie Xu, Tongyi Cao, Qifeng Chen
[pdf]
[DOI]

FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-World Point Clouds
Lihe Ding, Shaocong Dong, Tingfa Xu, Xinli Xu, Jie Wang, Jianan Li
[pdf]
[DOI]

SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention
Simon Doll, Richard Schulz, Lukas Schneider, Viviane Benzin, Markus Enzweiler, Hendrik P.A. Lensch
[pdf]
[DOI]

Pixel-Wise Energy-Biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes
Yu Tian, Yuyuan Liu, Guansong Pang, Fengbei Liu, Yuanhong Chen, Gustavo Carneiro
[pdf]
[DOI]

Rethinking Closed-Loop Training for Autonomous Driving
Chris Zhang, Runsheng Guo, Wenyuan Zeng, Yuwen Xiong, Binbin Dai, Rui Hu, Mengye Ren, Raquel Urtasun
[pdf]
[DOI]

SLiDE: Self-Supervised LiDAR De-Snowing through Reconstruction Difficulty
Gwangtak Bae, Byungjun Kim, Seongyong Ahn, Jihong Min, Inwook Shim
[pdf]
[DOI]

Generative Meta-Adversarial Network for Unseen Object Navigation
Sixian Zhang, Weijie Li, Xinhang Song, Yubing Bai, Shuqiang Jiang
[pdf]
[DOI]

Object Manipulation via Visual Target Localization
Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi
[pdf]
[DOI]

MoDA: Map Style Transfer for Self-Supervised Domain Adaptation of Embodied Agents
Eun Sun Lee, Junho Kim, SangWon Park, Young Min Kim
[pdf]
[DOI]

Housekeep: Tidying Virtual Households Using Commonsense Reasoning
Yash Kant, Arun Ramachandran, Sriram Yenamandra, Igor Gilitschenski, Dhruv Batra, Andrew Szot, Harsh Agrawal
[pdf]
[DOI]

Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects
Qiyu Dai, Jiyao Zhang, Qiwei Li, Tianhao Wu, Hao Dong, Ziyuan Liu, Ping Tan, He Wang
[pdf]
[DOI]

Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction
Chia-Chi Chuang, Donglin Yang, Chuan Wen, Yang Gao
[pdf]
[DOI]

OPD: Single-View 3D Openable Part Detection
Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang
[pdf]
[DOI]

AirDet: Few-Shot Detection without Fine-Tuning for Autonomous Exploration
Bowen Li, Chen Wang, Pranay Reddy, Seungchan Kim, Sebastian Scherer
[pdf]
[DOI]

TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance
Hongtao Wen, Jianhang Yan, Wanli Peng, Yi Sun
[pdf]
[DOI]

StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning
Jinghuan Shang, Kumara Kahatapitiya, Xiang Li, Michael S. Ryoo
[pdf]
[DOI]

TIDEE: Tidying Up Novel Rooms Using Visuo-Semantic Commonsense Priors
Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta, Katerina Fragkiadaki
[pdf]
[DOI]

Learning Efficient Multi-agent Cooperative Visual Exploration
Chao Yu, Xinyi Yang, Jiaxuan Gao, Huazhong Yang, Yu Wang, Yi Wu
[pdf]
[DOI]

Zero-Shot Category-Level Object Pose Estimation
Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner
[pdf]
[DOI]

Sim-to-Real 6D Object Pose Estimation via Iterative Self-Training for Robotic Bin Picking
Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-Hui Liu, Pieter Abbeel, Qi Dou
[pdf]
[DOI]

Active Audio-Visual Separation of Dynamic Sound Sources
Sagnik Majumder, Kristen Grauman
[pdf]
[DOI]

DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang
[pdf]
[DOI]

Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments
Jacob Krantz, Stefan Lee
[pdf]
[DOI]

Style-Agnostic Reinforcement Learning
Juyong Lee, Seokjun Ahn, Jaesik Park
[pdf]
[DOI]

Self-Supervised Interactive Object Segmentation through a Singulation-and-Grasping Approach
Houjian Yu, Changhyun Choi
[pdf]
[DOI]

Learning from Unlabeled 3D Environments for Vision-and-Language Navigation
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
[pdf]
[DOI]

BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking
Dorian F. Henning, Tristan Laidlow, Stefan Leutenegger
[pdf]
[DOI]

FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion
Fabian Duffhauss, Ngo Anh Vien, Hanna Ziesche, Gerhard Neumann
[pdf]
[DOI]

Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning
Chi Zhang, Sirui Xie, Baoxiong Jia, Ying Nian Wu, Song-Chun Zhu, Yixin Zhu
[pdf]
[DOI]

Video Dialog As Conversation about Objects Living in Space-Time
Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran
[pdf]
[DOI]

Quaternion Equivariant Capsule Networks for 3D Point Clouds
Yongheng Zhao, Tolga Birdal, Jan Eric Lenssen, Emanuele Menegatti, Leonidas Guibas, Federico Tombari
[pdf]
[DOI]

DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
Yizhak Ben-Shabat, Stephen Gould
[pdf]
[DOI]

NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search
Zhichao Lu, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
[pdf]
[DOI]

Describing Textures using Natural Language
Chenyun Wu, Mikayla Timm, Subhransu Maji
[pdf]
[DOI]

Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition
Rizard Renanda Adhi Pramono, Yie Tarng Chen, Wen Hsien Fang
[pdf]
[DOI]

AiR: Attention with Reasoning Capability
Shi Chen, Ming Jiang, Jinhui Yang, Qi Zhao
[pdf]
[DOI]

Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Gu Wang, Fabian Manhardt, Jianzhun Shao, Xiangyang Ji, Nassir Navab, Federico Tombari
[pdf]
[DOI]

Invertible Image Rescaling
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, Tie-Yan Liu
[pdf]
[DOI]

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan L. Yuille
[pdf]
[DOI]

House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation
Nelson Nauata, Kai-Hung Chang, Chin-Yi Cheng, Greg Mori, Yasutaka Furukawa
[pdf]
[DOI]

Crowdsampling the Plenoptic Function
Zhengqi Li, Wenqi Xian, Abe Davis, Noah Snavely
[pdf]
[DOI]

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment
Hanyue Tu, Chunyu Wang, Wenjun Zeng
[pdf]
[DOI]

End-to-End Object Detection with Transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
[pdf]
[DOI]

DeepSFM: Structure From Motion Via Deep Bundle Adjustment
Xingkui Wei, Yinda Zhang, Zhuwen Li, Yanwei Fu, Xiangyang Xue
[pdf]
[DOI]

Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
Yifan Xu, Tianqi Fan, Yi Yuan, Gurprit Singh
[pdf]
[DOI]

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Huan Huang, Shilei Wen, Errui Ding, Liusheng Huang
[pdf]
[DOI]

Conditional Convolutions for Instance Segmentation
Zhi Tian, Chunhua Shen, Hao Chen
[pdf]
[DOI]

MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Taojiannan Yang, Sijie Zhu, Chen Chen, Shen Yan, Mi Zhang, Andrew Willis
[pdf]
[DOI]

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie
[pdf]
[DOI]

Privacy Preserving Structure-from-Motion
Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schönberger, Marc Pollefeys
[pdf]
[DOI]

Rewriting a Deep Generative Model
David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba
[pdf]
[DOI]

Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
[pdf]
[DOI]

Long-term Human Motion Prediction with Scene Context
Zhe Cao, Hang Gao, Karttikeya Mangalam, Qi-Zhi Cai, Minh Vo, Jitendra Malik
[pdf]
[DOI]

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
[pdf]
[DOI]

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas
[pdf]
[DOI]

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
Benjamin Attal, Selena Ling, Aaron Gokaslan, Christian Richardt, James Tompkin
[pdf]
[DOI]

Learning and Aggregating Deep Local Descriptors for Instance-level Recognition
Giorgos Tolias, Tomas Jenicek, Ondřej Chum
[pdf]
[DOI]

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
George Terzakis, Manolis Lourakis
[pdf]
[DOI]

Learn to Recover Visible Color for Video Surveillance in a Day
Guangming Wu, Yinqiang Zheng, Zhiling Guo, Zekun Cai, Xiaodan Shi, Xin Ding, Yifei Huang, Yimin Guo, Ryosuke Shibasaki
[pdf]
[DOI]

Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images
Heming Zhu, Yu Cao, Hang Jin, Weikai Chen, Dong Du, Zhangye Wang, Shuguang Cui, Xiaoguang Han
[pdf]
[DOI]

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
Zhenda Xie, Zheng Zhang, Xizhou Zhu, Gao Huang, Stephen Lin
[pdf]
[DOI]

BorderDet: Border Feature for Dense Object Detection
Han Qiu, Yuchen Ma, Zeming Li, Songtao Liu, Jian Sun
[pdf]
[DOI]

Regularization with Latent Space Virtual Adversarial Training
Genki Osada, Budrul Ahsan, Revoti Prasad Bora, Takashi Nishide
[pdf]
[DOI]

Du²Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels
Yinda Zhang, Neal Wadhwa, Sergio Orts-Escolano, Christian Häne, Sean Fanello, Rahul Garg
[pdf]
[DOI]

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot Learning
Jaekyeom Kim, Hyoungseok Kim, Gunhee Kim
[pdf]
[DOI]

Targeted Attack for Deep Hashing based Retrieval
Jiawang Bai, Bin Chen, Yiming Li, Dongxian Wu, Weiwei Guo, Shu-Tao Xia, En-Hui Yang
[pdf]
[DOI]

Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang
[pdf]
[DOI]

Content-Aware Unsupervised Deep Homography Estimation
Jirong Zhang, Chuan Wang, Shuaicheng Liu, Lanpeng Jia, Nianjin Ye, Jue Wang, Ji Zhou, Jian Sun
[pdf]
[DOI]

Multi-View Optimization of Local Feature Geometry
Mihai Dusmanu, Johannes L. Schönberger, Marc Pollefeys
[pdf]
[DOI]

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization
Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon, Jamie Shotton
[pdf]
[DOI]

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
Miao Liu, Siyu Tang, Yin Li, James M. Rehg
[pdf]
[DOI]

Learning Stereo from Single Images
Jamie Watson, Oisin Mac Aodha, Daniyar Turmukhambetov, Gabriel J. Brostow, Michael Firman
[pdf]
[DOI]

Prototype Rectification for Few-Shot Learning
Jinlu Liu, Liang Song, Yongqiang Qin
[pdf]
[DOI]

Learning Feature Descriptors using Camera Pose Supervision
Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, Noah Snavely
[pdf]
[DOI]

Semantic Flow for Fast and Accurate Scene Parsing
Xiangtai Li, Ansheng You, Zhen Zhu, Houlong Zhao, Maoke Yang, Kuiyuan Yang, Shaohua Tan, Yunhai Tong
[pdf]
[DOI]

Appearance Consensus Driven Self-Supervised Human Mesh Recovery
Jogendra Nath Kundu, Mugalodi Rakesh, Varun Jampani, Rahul Mysore Venkatesh, R. Venkatesh Babu
[pdf]
[DOI]

Diffraction Line Imaging
Mark Sheinin, Dinesh N. Reddy, Matthew O’Toole, Srinivasa G. Narasimhan
[pdf]
[DOI]

Aligning and Projecting Images to Class-conditional Generative Networks
Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann
[pdf]
[DOI]

Suppress and Balance: A Simple Gated Network for Salient Object Detection
Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
[pdf]
[DOI]

Visual Memorability for Robotic Interestingness via Unsupervised Online Learning
Chen Wang, Wenshan Wang, Yuheng Qiu, Yafei Hu, Sebastian Scherer
[pdf]
[DOI]

Post-Training Piecewise Linear Quantization for Deep Neural Networks
Jun Fang, Ali Shafiee, Hamzah Abdel-Aziz, David Thorsley, Georgios Georgiadis, Joseph H. Hassoun
[pdf]
[DOI]

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification
Yang Zou, Xiaodong Yang, Zhiding Yu, B.V.K. Vijaya Kumar, Jan Kautz
[pdf]
[DOI]

In-Home Daily-Life Captioning Using Radio Signals
Lijie Fan, Tianhong Li, Yuan Yuan, Dina Katabi
[pdf]
[DOI]

Self-Challenging Improves Cross-Domain Generalization
Zeyi Huang, Haohan Wang, Eric P. Xing, Dong Huang
[pdf]
[DOI]

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
Qing Li, Siyuan Huang, Yining Hong, Song-Chun Zhu
[pdf]
[DOI]

Multitask Learning Strengthens Adversarial Robustness
Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang, Carl Vondrick
[pdf]
[DOI]

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search
Zhihang Yuan, Bingzhe Wu, Guangyu Sun, Zheng Liang, Shiwan Zhao, Weichen Bi
[pdf]
[DOI]

Improving Deep Video Compression by Resolution-adaptive Flow Coding
Zhihao Hu, Zhenghao Chen, Dong Xu, Guo Lu, Wanli Ouyang, Shuhang Gu
[pdf]
[DOI]

Motion Capture from Internet Videos
Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao
[pdf]
[DOI]

Appearance-Preserving 3D Convolution for Video-based Person Re-identification
Xinqian Gu, Hong Chang, Bingpeng Ma, Hongkai Zhang, Xilin Chen
[pdf]
[DOI]

Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization
Dylan Campbell, Liu Liu, Stephen Gould
[pdf]
[DOI]

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation
Xingang Pan, Xiaohang Zhan, Bo Dai, Dahua Lin, Chen Change Loy, Ping Luo
[pdf]
[DOI]

Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures
Mantang Guo, Junhui Hou, Jing Jin, Jie Chen, Lap-Pui Chau
[pdf]
[DOI]

Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling
Xuesong Niu, Zitong Yu, Hu Han, Xiaobai Li, Shiguang Shan, Guoying Zhao
[pdf]
[DOI]

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
[pdf]
[DOI]

Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network
Tsai-Shien Chen, Chih-Ting Liu, Chih-Wei Wu, Shao-Yi Chien
[pdf]
[DOI]

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
Guolei Sun, Wenguan Wang, Jifeng Dai, Luc Van Gool
[pdf]
[DOI]

CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image
Stefan Popov, Pablo Bauszat, Vittorio Ferrari
[pdf]
[DOI]

Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao
[pdf]
[DOI]

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Zachary Teed, Jia Deng
[pdf]
[DOI]

Domain-invariant Stereo Matching Networks
Feihu Zhang, Xiaojuan Qi, Ruigang Yang, Victor Prisacariu, Benjamin Wah, Philip Torr
[pdf]
[DOI]

DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling
Gyeongsik Moon, Takaaki Shiratori, Kyoung Mu Lee
[pdf]
[DOI]

Content Adaptive and Error Propagation Aware Deep Video Compression
Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao
[pdf]
[DOI]

Towards Streaming Perception
Mengtian Li, Yu-Xiong Wang, Deva Ramanan
[pdf]
[DOI]

Towards Automated Testing and Robustification by Semantic Adversarial Data Generation
Rakshith Shetty, Mario Fritz, Bernt Schiele
[pdf]
[DOI]

Adversarial Generative Grammars for Human Activity Prediction
AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo
[pdf]
[DOI]

GDumb: A Simple Approach that Questions Our Progress in Continual Learning
Ameya Prabhu, Philip H. S. Torr, Puneet K. Dokania
[pdf]
[DOI]

Learning Lane Graph Representations for Motion Forecasting
Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun
[pdf]
[DOI]

What Matters in Unsupervised Optical Flow
Rico Jonschkowski, Austin Stone, Jonathan T. Barron, Ariel Gordon, Kurt Konolige, Anelia Angelova
[pdf]
[DOI]

Synthesis and Completion of Facades from Satellite Imagery
Xiaowei Zhang, Christopher May, Daniel Aliaga
[pdf]
[DOI]

Mapillary Planet-Scale Depth Dataset
Manuel López Antequera, Pau Gargallo, Markus Hofinger, Samuel Rota Bulò, Yubin Kuang, Peter Kontschieder
[pdf]
[DOI]

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction
Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, Raquel Urtasun
[pdf]
[DOI]

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
Haoyu Liang, Zhihao Ouyang, Yuyuan Zeng, Hang Su, Zihao He, Shu-Tao Xia, Jun Zhu, Bo Zhang
[pdf]
[DOI]

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
Bailin Li, Bowen Wu, Jiang Su, Guangrun Wang
[pdf]
[DOI]

Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation
Marie-Julie Rakotosaona, Maks Ovsjanikov
[pdf]
[DOI]

Cross-Domain Cascaded Deep Translation
Oren Katzir, Dani Lischinski, Daniel Cohen-Or
[pdf]
[DOI]

“Look Ma, no landmarks!” – Unsupervised, Model-based Dense Face Alignment
Tatsuro Koizumi, William A. P. Smith
[pdf]
[DOI]

Online Invariance Selection for Local Feature Descriptors
Rémi Pautrat, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
[pdf]
[DOI]

Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations
Hongyu Liu, Bin Jiang, Yibing Song, Wei Huang, Chao Yang
[pdf]
[DOI]

TextCaps: a Dataset for Image Captioning with Reading Comprehension
Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh
[pdf]
[DOI]

It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction
Karttikeya Mangalam, Harshayu Girase, Shreyas Agarwal, Kuan-Hui Lee, Ehsan Adeli, Jitendra Malik, Adrien Gaidon
[pdf]
[DOI]

Learning What to Learn for Video Object Segmentation
Goutam Bhat, Felix Järemo Lawin, Martin Danelljan, Andreas Robinson, Michael Felsberg, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing
Garvita Tiwari, Bharat Lal Bhatnagar, Tony Tung, Gerard Pons-Moll
[pdf]
[DOI]

LIMP: Learning Latent Shape Representations with Metric Preservation Priors
Luca Cosmo, Antonio Norelli, Oshri Halimi, Ron Kimmel, Emanuele Rodolà
[pdf]
[DOI]

Unsupervised Sketch to Photo Synthesis
Runtao Liu, Qian Yu, Stella X. Yu
[pdf]
[DOI]

A Simple Way to Make Neural Networks Robust Against Diverse Image Corruptions
Evgenia Rusak, Lukas Schott, Roland S. Zimmermann, Julian Bitterwolf, Oliver Bringmann, Matthias Bethge, Wieland Brendel
[pdf]
[DOI]

SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari
[pdf]
[DOI]

Hierarchical Face Aging through Disentangled Latent Characteristics
Peipei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun
[pdf]
[DOI]

Hybrid Models for Open Set Recognition
Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo
[pdf]
[DOI]

TopoGAN: A Topology-Aware Generative Adversarial Network
Fan Wang, Huidong Liu, Dimitris Samaras, Chao Chen
[pdf]
[DOI]

Learning to Localize Actions from Moments
Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei
[pdf]
[DOI]

ForkGAN: Seeing into the Rainy Night
Ziqiang Zheng, Yang Wu, Xinran Han, Jianbo Shi
[pdf]
[DOI]

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning
Xinwei Sun, Yilun Xu, Peng Cao, Yuqing Kong, Lingjing Hu, Shanghang Zhang, Yizhou Wang
[pdf]
[DOI]

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval
Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie
[pdf]
[DOI]

TSIT: A Simple and Versatile Framework for Image-to-Image Translation
Liming Jiang, Changxu Zhang, Mingyang Huang, Chunxiao Liu, Jianping Shi, Chen Change Loy
[pdf]
[DOI]

ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices
Xiangyu He, Zitao Mo, Ke Cheng, Weixiang Xu, Qinghao Hu, Peisong Wang, Qingshan Liu, Jian Cheng
[pdf]
[DOI]

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
Can Wang, Jiefeng Li, Wentao Liu, Chen Qian, Cewu Lu
[pdf]
[DOI]

Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai
[pdf]
[DOI]

A Unified Framework of Surrogate Loss by Refactoring and Interpolation
Lanlan Liu, Mingzhe Wang, Jia Deng
[pdf]
[DOI]

Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images
Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi
[pdf]
[DOI]

Memory-augmented Dense Predictive Coding for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
[pdf]
[DOI]

PointMixup: Augmentation for Point Clouds
Yunlu Chen, Vincent Tao Hu, Efstratios Gavves, Thomas Mensink, Pascal Mettes, Pengwan Yang, Cees G. M. Snoek
[pdf]
[DOI]

Identity-Guided Human Semantic Parsing for Person Re-Identification
Kuan Zhu, Haiyun Guo, Zhiwei Liu, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

Learning Gradient Fields for Shape Generation
Ruojin Cai, Guandao Yang, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan
[pdf]
[DOI]

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder
Kuniaki Saito, Kate Saenko, Ming-Yu Liu
[pdf]
[DOI]

Corner Proposal Network for Anchor-free, Two-stage Object Detection
Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
[pdf]
[DOI]

PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click
Henghui Ding, Scott Cohen, Brian Price, Xudong Jiang
[pdf]
[DOI]

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian, Dingzeyu Li, Chenliang Xu
[pdf]
[DOI]

Learning Delicate Local Representations for Multi-Person Pose Estimation
Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
[pdf]
[DOI]

Learning to Plan with Uncertain Topological Maps
Edward Beeching, Jilles Dibangoye, Olivier Simonin, Christian Wolf
[pdf]
[DOI]

Neural Design Network: Graphic Layout Generation with Constraints
Hsin-Ying Lee, Lu Jiang, Irfan Essa, Phuong B Le, Haifeng Gong, Ming-Hsuan Yang, Weilong Yang
[pdf]
[DOI]

Learning Open Set Network with Discriminative Reciprocal Points
Guangyao Chen, Limeng Qiao, Yemin Shi, Peixi Peng, Jia Li, Tiejun Huang, Shiliang Pu, Yonghong Tian
[pdf]
[DOI]

Convolutional Occupancy Networks
Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger
[pdf]
[DOI]

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry
He Chen, Pengfei Guo, Pengfei Li, Gim Hee Lee, Gregory Chirikjian
[pdf]
[DOI]

TIDE: A General Toolbox for Identifying Object Detection Errors
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman
[pdf]
[DOI]

PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas Guibas, Or Litany
[pdf]
[DOI]

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation
Xuefei Ning, Tianchen Zhao, Wenshuo Li, Peng Lei, Yu Wang, Huazhong Yang
[pdf]
[DOI]

Circumventing Outliers of AutoAugment with Knowledge Distillation
Longhui Wei, An Xiao, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Qi Tian
[pdf]
[DOI]

S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching
Hugo Germain, Guillaume Bourmaud, Vincent Lepetit
[pdf]
[DOI]

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving
Peixuan Li, Huaici Zhao, Pengfei Liu, Feidao Cao
[pdf]
[DOI]

Video Object Segmentation with Episodic Graph Memory Networks
Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc Van Gool
[pdf]
[DOI]

Rethinking Bottleneck Structure for Efficient Mobile Network Design
Daquan Zhou, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
[pdf]
[DOI]

Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
Jeffrey O. Zhang, Alexander Sax, Amir Zamir, Leonidas Guibas, Jitendra Malik
[pdf]
[DOI]

Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach
Zerui Chen, Yan Huang, Hongyuan Yu, Bin Xue, Ke Han, Yiru Guo, Liang Wang
[pdf]
[DOI]

REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
Angelina Wang, Arvind Narayanan, Olga Russakovsky
[pdf]
[DOI]

Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem
[pdf]
[DOI]

Collaborative Learning of Gesture Recognition and 3D Hand Pose Estimation with Multi-Order Feature Analysis
Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, Alex C. Kot
[pdf]
[DOI]

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors
Zuxuan Wu, Ser-Nam Lim, Larry S. Davis, Tom Goldstein
[pdf]
[DOI]

TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images
Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo
[pdf]
[DOI]

Semi-Siamese Training for Shallow Face Learning
Hang Du, Hailin Shi, Yuchi Liu, Jun Wang, Zhen Lei, Dan Zeng, Tao Mei
[pdf]
[DOI]

GAN Slimming: All-in-One GAN Compression by A Unified Optimization Framework
Haotao Wang, Shupeng Gui, Haichuan Yang, Ji Liu, Zhangyang Wang
[pdf]
[DOI]

Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition
Yukun Su, Guosheng Lin, Jinhui Zhu, Qingyao Wu
[pdf]
[DOI]

Binarized Neural Network for Single Image Super Resolution
Jingwei Xin, Nannan Wang, Xinrui Jiang, Jie Li, Heng Huang, Xinbo Gao
[pdf]
[DOI]

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
[pdf]
[DOI]

Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation
Zhipeng Fan, Jun Liu, Yao Wang
[pdf]
[DOI]

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking
Jinlong Peng, Changan Wang, Fangbin Wan, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu
[pdf]
[DOI]

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
Tong Wu, Qingqiu Huang, Ziwei Liu, Yu Wang, Dahua Lin
[pdf]
[DOI]

Hamiltonian Dynamics for Real-World Shape Interpolation
Marvin Eisenberger, Daniel Cremers
[pdf]
[DOI]

Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns, Donghyun Kim, Derry Wijaya, Kate Saenko, Bryan A. Plummer
[pdf]
[DOI]

Multi-modal Transformer for Video Retrieval
Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid
[pdf]
[DOI]

Feature Representation Matters: End-to-End Learning for Reference-based Image Super-resolution
Yanchun Xie, Jimin Xiao, Mingjie Sun, Chao Yao, Kaizhu Huang
[pdf]
[DOI]

RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera
Zhuo Su, Lan Xu, Zerong Zheng, Tao Yu, Yebin Liu, Lu Fang
[pdf]
[DOI]

Surface Normal Estimation of Tilted Images via Spatial Rectifier
Tien Do, Khiem Vuong, Stergios I. Roumeliotis, Hyun Soo Park
[pdf]
[DOI]

Multimodal Shape Completion via Conditional Generative Adversarial Networks
Rundi Wu, Xuelin Chen, Yixin Zhuang, Baoquan Chen
[pdf]
[DOI]

Generative Sparse Detection Networks for 3D Single-shot Object Detection
JunYoung Gwak, Christopher Choy, Silvio Savarese
[pdf]
[DOI]

Grounded Situation Recognition
Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi
[pdf]
[DOI]

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen, Wenhao Jiang, Wei Liu, Yu-Gang Jiang
[pdf]
[DOI]

Unpaired Learning of Deep Image Denoising
Xiaohe Wu, Ming Liu, Yue Cao, Dongwei Ren, Wangmeng Zuo
[pdf]
[DOI]

Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, Hongsheng Li
[pdf]
[DOI]

Rotationally-Temporally Consistent Novel View Synthesis of Human Performance Video
Youngjoong Kwon, Stefano Petrangeli, Dahun Kim, Haoliang Wang, Eunbyung Park, Viswanathan Swaminathan, Henry Fuchs
[pdf]
[DOI]

Side-Aware Boundary Localization for More Precise Object Detection
Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin
[pdf]
[DOI]

SF-Net: Single-Frame Supervision for Temporal Action Localization
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou
[pdf]
[DOI]

Negative Margin Matters: Understanding Margin in Few-shot Classification
Bin Liu, Yue Cao, Yutong Lin, Qi Li, Zheng Zhang, Mingsheng Long, Han Hu
[pdf]
[DOI]

Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References
Ruizheng Wu, Xin Tao, Yingcong Chen, Xiaoyong Shen, Jiaya Jia
[pdf]
[DOI]

Tracking Objects as Points
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl
[pdf]
[DOI]

CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis
Jiadong Liang, Wenjie Pei, Feng Lu
[pdf]
[DOI]

Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning
Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi
[pdf]
[DOI]

MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning
Simon Vandenhende, Stamatios Georgoulis, Luc Van Gool
[pdf]
[DOI]

Learning to Factorize and Relight a City
Andrew Liu, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros, Noah Snavely
[pdf]
[DOI]

Region Graph Embedding Network for Zero-Shot Learning
Guo-Sen Xie, Li Liu, Fan Zhu, Fang Zhao, Zheng Zhang, Yazhou Yao, Jie Qin, Ling Shao
[pdf]
[DOI]

GRAB: A Dataset of Whole-Body Human Grasping of Objects
Omid Taheri, Nima Ghorbani, Michael J. Black, Dimitrios Tzionas
[pdf]
[DOI]

DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects
Edgar Tretschk, Ayush Tewari, Michael Zollhöfer, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

RANSAC-Flow: Generic Two-stage Image Alignment
Xi Shen, François Darmon, Alexei A. Efros, Mathieu Aubry
[pdf]
[DOI]

Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool
[pdf]
[DOI]

Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images
Kiru Park, Timothy Patten, Markus Vincze
[pdf]
[DOI]

Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
Jianfeng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
[pdf]
[DOI]

Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application
Xuchong Qiu, Yang Xiao, Chaohui Wang, Renaud Marlet
[pdf]
[DOI]

MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang, Yu Xiong, Anyi Rao, Jiaze Wang, Dahua Lin
[pdf]
[DOI]

Short-Term and Long-Term Context Aggregation Network for Video Inpainting
Ang Li, Shanshan Zhao, Xingjun Ma, Mingming Gong, Jianzhong Qi, Rui Zhang, Dacheng Tao, Ramamohanarao Kotagiri
[pdf]
[DOI]

DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization
Juan Du, Rui Wang, Daniel Cremers
[pdf]
[DOI]

Face Super-Resolution Guided by 3D Facial Priors
Xiaobin Hu, Wenqi Ren, John LaMaster, Xiaochun Cao, Xiaoming Li, Zechao Li, Bjoern Menze, Wei Liu
[pdf]
[DOI]

Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning Baseline for Unsupervised Domain Adaptation
Yabin Zhang, Bin Deng, Kui Jia, Lei Zhang
[pdf]
[DOI]

Are Labels Necessary for Neural Architecture Search?
Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie
[pdf]
[DOI]

BLSM: A Bone-Level Skinned Model of the Human Mesh
Haoyang Wang, Riza Alp Güler, Iasonas Kokkinos, George Papandreou, Stefanos Zafeiriou
[pdf]
[DOI]

Associative Alignment for Few-shot Image Classification
Arman Afrasiyabi, Jean-François Lalonde, Christian Gagné
[pdf]
[DOI]

Cyclic Functional Mapping: Self-supervised Correspondence between Non-isometric Deformable Shapes
Dvir Ginzburg, Dan Raviv
[pdf]
[DOI]

View-Invariant Probabilistic Embedding for Human Pose
Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Ting Liu
[pdf]
[DOI]

Contact and Human Dynamics from Monocular Video
Davis Rempe, Leonidas J. Guibas, Aaron Hertzmann, Bryan Russell, Ruben Villegas, Jimei Yang
[pdf]
[DOI]

PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation
Wenxuan Wu, Zhi Yuan Wang, Zhuwen Li, Wei Liu, Li Fuxin
[pdf]
[DOI]

Points2Surf: Learning Implicit Surfaces from Point Clouds
Philipp Erler, Paul Guerrero, Stefan Ohrhallinger, Niloy J. Mitra, Michael Wimmer
[pdf]
[DOI]

Few-Shot Scene-Adaptive Anomaly Detection
Yiwei Lu, Frank Yu, Mahesh Kumar Krishna Reddy, Yang Wang
[pdf]
[DOI]

Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
Bindita Chaudhuri, Noranart Vesdapunt, Linda Shapiro, Baoyuan Wang
[pdf]
[DOI]

Entropy Minimisation Framework for Event-based Vision Model Estimation
Urbano Miguel Nunes, Yiannis Demiris
[pdf]
[DOI]

Reconstructing NBA Players
Luyang Zhu, Konstantinos Rematas, Brian Curless, Steven M. Seitz, Ira Kemelmacher-Shlizerman
[pdf]
[DOI]

PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments
Zhiming Chen, Kean Chen, Weiyao Lin, John See, Hui Yu, Yan Ke, Cong Yang
[pdf]
[DOI]

TENet: Triple Excitation Network for Video Salient Object Detection
Sucheng Ren, Chu Han, Xin Yang, Guoqiang Han, Shengfeng He
[pdf]
[DOI]

Deep Feedback Inverse Problem Solver
Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun
[pdf]
[DOI]

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
Liuyu Xiang, Guiguang Ding, Jungong Han
[pdf]
[DOI]

Hallucinating Visual Instances in Total Absentia
Jiayan Qiu, Yiding Yang, Xinchao Wang, Dacheng Tao
[pdf]
[DOI]

Weakly-supervised 3D Shape Completion in the Wild
Jiayuan Gu, Wei-Chiu Ma, Sivabalan Manivasagam, Wenyuan Zeng, Zihao Wang, Yuwen Xiong, Hao Su, Raquel Urtasun
[pdf]
[DOI]

DTVNet: Dynamic Time-lapse Video Generation via Single Still Image
Jiangning Zhang, Chao Xu, Liang Liu, Mengmeng Wang, Xia Wu, Yong Liu, Yunliang Jiang
[pdf]
[DOI]

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss
Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan
[pdf]
[DOI]

Collaborative Video Object Segmentation by Foreground-Background Integration
Zongxin Yang, Yunchao Wei, Yi Yang
[pdf]
[DOI]

Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR
Titir Dutta, Anurag Singh, Soma Biswas
[pdf]
[DOI]

ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation
Xucong Zhang, Seonwook Park, Thabo Beeler, Derek Bradley, Siyu Tang, Otmar Hilliges
[pdf]
[DOI]

Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors
Viktor Larsson, Nicolas Zobernig, Kasim Taskin, Marc Pollefeys
[pdf]
[DOI]

Occupancy Anticipation for Efficient Exploration and Navigation
Santhosh K. Ramakrishnan, Ziad Al-Halah, Kristen Grauman
[pdf]
[DOI]

Unified Image and Video Saliency Modeling
Richard Droste, Jianbo Jiao, J. Alison Noble
[pdf]
[DOI]

TAO: A Large-Scale Benchmark for Tracking Any Object
Achal Dave, Tarasha Khurana, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
[pdf]
[DOI]

A Generalization of Otsu’s Method and Minimum Error Thresholding
Jonathan T. Barron
[pdf]
[DOI]

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
[pdf]
[DOI]

Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby
[pdf]
[DOI]

VisualCOMET: Reasoning about the Dynamic Context of a Still Image
Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, Yejin Choi
[pdf]
[DOI]

Few-shot Action Recognition with Permutation-invariant Attention
Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz
[pdf]
[DOI]

Character Grounding and Re-Identification in Story of Videos and Text Descriptions
Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung, Gunhee Kim
[pdf]
[DOI]

AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
Wenshuo Ma, Tingzhong Tian, Hang Xu, Yimin Huang, Zhenguo Li
[pdf]
[DOI]

Learning Visual Context by Comparison
Minchul Kim, Jongchan Park, Seil Na, Chang Min Park, Donggeun Yoo
[pdf]
[DOI]

Large Scale Holistic Video Understanding
Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jürgen Gall, Rainer Stiefelhagen, Luc Van Gool
[pdf]
[DOI]

Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Krishna Kanth Nakka, Mathieu Salzmann
[pdf]
[DOI]

Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings
Anita Rau, Guillermo Garcia-Hernando, Danail Stoyanov, Gabriel J. Brostow, Daniyar Turmukhambetov
[pdf]
[DOI]

Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset, Jasper Uijlings, Soravit Changpinyo, Radu Soricut, Vittorio Ferrari
[pdf]
[DOI]

Adversarial T-shirt! Evading Person Detectors in A Physical World
Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, Xue Lin
[pdf]
[DOI]

Bounding-box Channels for Visual Relationship Detection
Sho Inayoshi, Keita Otani, Antonio Tejero-de-Pablos, Tatsuya Harada
[pdf]
[DOI]

Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion
Zuzana Kukelova, Cenek Albl, Akihiro Sugimoto, Konrad Schindler, Tomas Pajdla
[pdf]
[DOI]

SRFlow: Learning the Super-Resolution Space with Normalizing Flow
Andreas Lugmayr, Martin Danelljan, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

DeepGMR: Learning Latent Gaussian Mixture Models for Registration
Wentao Yuan, Benjamin Eckart, Kihwan Kim, Varun Jampani, Dieter Fox, Jan Kautz
[pdf]
[DOI]

Active Perception using Light Curtains for Autonomous Driving
Siddharth Ancha, Yaadhav Raaj, Peiyun Hu, Srinivasa G. Narasimhan, David Held
[pdf]
[DOI]

Invertible Neural BRDF for Object Inverse Rendering
Zhe Chen, Shohei Nobuhara, Ko Nishino
[pdf]
[DOI]

Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network
Wenfeng Luo, Meng Yang
[pdf]
[DOI]

Practical Deep Raw Image Denoising on Mobile Devices
Yuzhi Wang, Haibin Huang, Qin Xu, Jiaming Liu, Yiqun Liu, Jue Wang
[pdf]
[DOI]

SoundSpaces: Audio-Visual Navigation in 3D Environments
Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
[pdf]
[DOI]

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
Yuanhao Zhai, Le Wang, Wei Tang, Qilin Zhang, Junsong Yuan, Gang Hua
[pdf]
[DOI]

Erasing Appearance Preservation in Optimization-based Smoothing
Lvmin Zhang, Chengze Li, Yi Ji, Chunping Liu, Tien-tsin Wong
[pdf]
[DOI]

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
Tsu-Jui Fu, Xin Eric Wang, Matthew F. Peterson, Scott T. Grafton, Miguel P. Eckstein, William Yang Wang
[pdf]
[DOI]

Guided Deep Decoder: Unsupervised Image Pair Fusion
Tatsumi Uezato, Danfeng Hong, Naoto Yokoya, Wei He
[pdf]
[DOI]

Filter Style Transfer between Photos
Jonghwa Yim, Jisung Yoo, Won-joon Do, Beomsu Kim, Jihwan Choe
[pdf]
[DOI]

JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image
Linpu Fang, Xingyan Liu, Li Liu, Hang Xu, Wenxiong Kang
[pdf]
[DOI]

Dynamic Group Convolution for Accelerating Convolutional Neural Networks
Zhuo Su, Linpu Fang, Wenxiong Kang, Dewen Hu, Matti Pietikäinen, Li Liu
[pdf]
[DOI]

RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering
Yaoxiong Huang, Mengchao He, Lianwen Jin, Yongpan Wang
[pdf]
[DOI]

Object-Contextual Representations for Semantic Segmentation
Yuhui Yuan, Xilin Chen, Jingdong Wang
[pdf]
[DOI]

Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring
Zhihang Zhong, Ye Gao, Yinqiang Zheng, Bo Zheng
[pdf]
[DOI]

Joint Semantic Instance Segmentation on Graphs with the Semantic Mutex Watershed
Steffen Wolf, Yuyan Li, Constantin Pape, Alberto Bailoni, Anna Kreshuk, Fred A. Hamprecht
[pdf]
[DOI]

Photon-Efficient 3D Imaging with A Non-Local Neural Network
Jiayong Peng, Zhiwei Xiong, Xin Huang, Zheng-Ping Li, Dong Liu, Feihu Xu
[pdf]
[DOI]

GeLaTO: Generative Latent Textured Objects
Ricardo Martin-Brualla, Rohit Pandey, Sofien Bouaziz, Matthew Brown, Dan B Goldman
[pdf]
[DOI]

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
[pdf]
[DOI]

Directional Temporal Modeling for Action Recognition
Xinyu Li, Bing Shuai, Joseph Tighe
[pdf]
[DOI]

Shonan Rotation Averaging: Global Optimality by Surfing SO(p)^n
Frank Dellaert, David M. Rosen, Jing Wu, Robert Mahony, Luca Carlone
[pdf]
[DOI]

Semantic Curiosity for Active Visual Learning
Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta
[pdf]
[DOI]

Multi-Temporal Recurrent Neural Networks For Progressive Non-Uniform Single Image Deblurring With Incremental Temporal Training
Dongwon Park, Dong Un Kang, Jisoo Kim, Se Young Chun
[pdf]
[DOI]

ProgressFace: Scale-Aware Progressive Learning for Face Detection
Jiashu Zhu, Dong Li, Tiantian Han, Lu Tian, Yi Shan
[pdf]
[DOI]

Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference
Erik Nijkamp, Bo Pang, Tian Han, Linqi Zhou, Song-Chun Zhu, Ying Nian Wu
[pdf]
[DOI]

CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos
Zhensheng Shi, Cheng Guan, Liangjie Cao, Qianqian Li, Ju Liang, Zhaorui Gu, Haiyong Zheng, Bing Zheng
[pdf]
[DOI]

Modeling the Effects of Windshield Refraction for Camera Calibration
Frank Verbiest, Marc Proesmans, Luc Van Gool
[pdf]
[DOI]

Unsupervised Domain Adaptation for Semantic Segmentation of NIR Images through Generative Latent Search
Prashant Pandey, Aayush Kumar Tyagi, Sameer Ambekar, Prathosh AP
[pdf]
[DOI]

PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Eunhyeok Park, Sungjoo Yoo
[pdf]
[DOI]

Visual Relation Grounding in Videos
Junbin Xiao, Xindi Shang, Xun Yang, Sheng Tang, Tat-Seng Chua
[pdf]
[DOI]

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
[pdf]
[DOI]

Controlling Style and Semantics in Weakly-Supervised Image Generation
Dario Pavllo, Aurelien Lucchi, Thomas Hofmann
[pdf]
[DOI]

Jointly learning visual motion and confidence from local patches in event cameras
Daniel R. Kepple, Daewon Lee, Colin Prepsius, Volkan Isler, Il Memming Park, Daniel D. Lee
[pdf]
[DOI]

SODA: Story Oriented Dense Video Captioning Evaluation Framework
Soichiro Fujita, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
[pdf]
[DOI]

Sketch-Guided Object Localization in Natural Images
Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
[pdf]
[DOI]

A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses
Malik Boudiaf, Jérôme Rony, Imtiaz Masud Ziko, Eric Granger, Marco Pedersoli, Pablo Piantanida, Ismail Ben Ayed
[pdf]
[DOI]

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
[pdf]
[DOI]

The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement
William Peebles, John Peebles, Jun-Yan Zhu, Alexei Efros, Antonio Torralba
[pdf]
[DOI]

STAR: Sparse Trained Articulated Human Body Regressor
Ahmed A. A. Osman, Timo Bolkart, Michael J. Black
[pdf]
[DOI]

Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer
Xinghao Chen, Yiman Zhang, Yunhe Wang, Han Shu, Chunjing Xu, Chang Xu
[pdf]
[DOI]

Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning
Sihui Luo, Wenwen Pan, Xinchao Wang, Dazhou Wang, Haihong Tang, Mingli Song
[pdf]
[DOI]

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians
Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
[pdf]
[DOI]

Learning 3D Part Assembly from a Single Image
Yichen Li, Kaichun Mo, Lin Shao, Minhyuk Sung, Leonidas Guibas
[pdf]
[DOI]

PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
Kaichun Mo, He Wang, Xinchen Yan, Leonidas Guibas
[pdf]
[DOI]

Highly Efficient Salient Object Detection with 100K Parameters
Shang-Hua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
[pdf]
[DOI]

HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing
Qili Deng, Ziling Huang, Chung-Chi Tsai, Chia-Wen Lin
[pdf]
[DOI]

Lifespan Age Transformation Synthesis
Roy Or-El, Soumyadip Sengupta, Ohad Fried, Eli Shechtman, Ira Kemelmacher-Shlizerman
[pdf]
[DOI]

Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation
Xingchao Peng, Yichen Li, Kate Saenko
[pdf]
[DOI]

Simulating Content Consistent Vehicle Datasets with Attribute Descent
Yue Yao, Liang Zheng, Xiaodong Yang, Milind Naphade, Tom Gedeon
[pdf]
[DOI]

Multiview Detection with Feature Perspective Transformation
Yunzhong Hou, Liang Zheng, Stephen Gould
[pdf]
[DOI]

Learning Object Relation Graph and Tentative Policy for Visual Navigation
Heming Du, Xin Yu, Liang Zheng
[pdf]
[DOI]

Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
Chenyang Si, Xuecheng Nie, Wei Wang, Liang Wang, Tieniu Tan, Jiashi Feng
[pdf]
[DOI]

Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning
Liad Pollak Zuckerman, Eyal Naor, George Pisha, Shai Bagon, Michal Irani
[pdf]
[DOI]

Inducing Optimal Attribute Representations for Conditional GANs
Binod Bhattarai, Tae-Kyun Kim
[pdf]
[DOI]

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris
[pdf]
[DOI]

Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation
Vladimir V. Kniaz, Vladimir A. Knyaz, Fabio Remondino, Artem Bordodymov, Petr Moshkantsev
[pdf]
[DOI]

Consistency Guided Scene Flow Estimation
Yuhua Chen, Luc Van Gool, Cordelia Schmid, Cristian Sminchisescu
[pdf]
[DOI]

Autoregressive Unsupervised Image Segmentation
Yassine Ouali, Céline Hudelot, Myriam Tami
[pdf]
[DOI]

Controllable Image Synthesis via SegVAE
Yen-Chi Cheng, Hsin-Ying Lee, Min Sun, Ming-Hsuan Yang
[pdf]
[DOI]

Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search
Yuan Tian, Qin Wang, Zhiwu Huang, Wen Li, Dengxin Dai, Minghao Yang, Jun Wang, Olga Fink
[pdf]
[DOI]

Efficient Non-Line-of-Sight Imaging from Transient Sinograms
Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O’Toole
[pdf]
[DOI]

Texture Hallucination for Large-Factor Painting Super-Resolution
Yulun Zhang, Zhifei Zhang, Stephen DiVerdi, Zhaowen Wang, Jose Echevarria, Yun Fu
[pdf]
[DOI]

Learning Progressive Joint Propagation for Human Motion Prediction
Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann
[pdf]
[DOI]

Image Stitching and Rectification for Hand-Held Cameras
Bingbing Zhuang, Quoc-Huy Tran
[pdf]
[DOI]

ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds
Gopal Sharma, Difan Liu, Subhransu Maji, Evangelos Kalogerakis, Siddhartha Chaudhuri, Radomír Měch
[pdf]
[DOI]

The Group Loss for Deep Metric Learning
Ismail Elezi, Sebastiano Vascon, Alessandro Torcinovich, Marcello Pelillo, Laura Leal-Taixé
[pdf]
[DOI]

Learning Object Depth from Camera Motion and Video Object Segmentation
Brent A. Griffin, Jason J. Corso
[pdf]
[DOI]

OnlineAugment: Online Data Augmentation with Less Domain Knowledge
Zhiqiang Tang, Yunhe Gao, Leonid Karlinsky, Prasanna Sattigeri, Rogerio Feris, Dimitris Metaxas
[pdf]
[DOI]

Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction
Yiming Qian, Yasutaka Furukawa
[pdf]
[DOI]

Intra-class Feature Variation Distillation for Semantic Segmentation
Yukang Wang, Wei Zhou, Tao Jiang, Xiang Bai, Yongchao Xu
[pdf]
[DOI]

Temporal Distinct Representation Learning for Action Recognition
Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan
[pdf]
[DOI]

Representative Graph Neural Network
Changqian Yu, Yifan Liu, Changxin Gao, Chunhua Shen, Nong Sang
[pdf]
[DOI]

Deformation-Aware 3D Model Embedding and Retrieval
Mikaela Angelina Uy, Jingwei Huang, Minhyuk Sung, Tolga Birdal, Leonidas Guibas
[pdf]
[DOI]

Atlas: End-to-End 3D Scene Reconstruction from Posed Images
Zak Murez, Tarrence van As, James Bartolozzi, Ayan Sinha, Vijay Badrinarayanan, Andrew Rabinovich
[pdf]
[DOI]

Multiple Class Novelty Detection Under Data Distribution Shift
Poojan Oza, Hien V. Nguyen, Vishal M. Patel
[pdf]
[DOI]

Colorization of Depth Map via Disentanglement
Chung-Sheng Lai, Zunzhi You, Ching-Chun Huang, Yi-Hsuan Tsai, Wei-Chen Chiu
[pdf]
[DOI]

Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes
Johanna Wald, Torsten Sattler, Stuart Golodetz, Tommaso Cavallari, Federico Tombari
[pdf]
[DOI]

GeoGraph: Graph-based multi-view object detection with geometric cues end-to-end
Ahmed Samy Nassar, Stefano D’Aronco, Sébastien Lefèvre, Jan D. Wegner
[pdf]
[DOI]

Localizing the Common Action Among a Few Videos
Pengwan Yang, Vincent Tao Hu, Pascal Mettes, Cees G. M. Snoek
[pdf]
[DOI]

TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
Moshe Lichtenstein, Prasanna Sattigeri, Rogerio Feris, Raja Giryes, Leonid Karlinsky
[pdf]
[DOI]

Traffic Accident Benchmark for Causality Recognition
Tackgeun You, Bohyung Han
[pdf]
[DOI]

Face Anti-Spoofing with Human Material Perception
Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, Guoying Zhao
[pdf]
[DOI]

How Can I See My Future? FvTraj: Using First-person View for Pedestrian Trajectory Prediction
Huikun Bi, Ruisi Zhang, Tianlu Mao, Zhigang Deng, Zhaoqi Wang
[pdf]
[DOI]

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification
Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian
[pdf]
[DOI]

NASA Neural Articulated Shape Approximation
Boyang Deng, JP Lewis, Timothy Jeruzalski, Gerard Pons-Moll, Geoffrey Hinton, Mohammad Norouzi, Andrea Tagliasacchi
[pdf]
[DOI]

Towards Unique and Informative Captioning of Images
Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky
[pdf]
[DOI]

When Does Self-supervision Improve Few-shot Learning?
Jong-Chyi Su, Subhransu Maji, Bharath Hariharan
[pdf]
[DOI]

Two-branch Recurrent Network for Isolating Deepfakes in Videos
Iacopo Masi, Aditya Killekar, Royston Marian Mascarenhas, Shenoy Pratik Gurudatt, Wael AbdAlmageed
[pdf]
[DOI]

Incremental Few-Shot Meta-Learning via Indirect Discriminant Alignment
Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
[pdf]
[DOI]

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le
[pdf]
[DOI]

Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation
Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, Ping Luo
[pdf]
[DOI]

Global Distance-distributions Separation for Unsupervised Person Re-identification
Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
[pdf]
[DOI]

I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image
Gyeongsik Moon, Kyoung Mu Lee
[pdf]
[DOI]

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose
Hongsuk Choi, Gyeongsik Moon, Kyoung Mu Lee
[pdf]
[DOI]

ALRe: Outlier Detection for Guided Refinement
Mingzhu Zhu, Zhang Gao, Junzhi Yu, Bingwei He, Jiantao Liu
[pdf]
[DOI]

Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations
Yifan Yang, Guorong Li, Zhe Wu, Li Su, Qingming Huang, Nicu Sebe
[pdf]
[DOI]

Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition
Wen Ji, Kelei He, Jing Huo, Zheng Gu, Yang Gao
[pdf]
[DOI]

Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection
Carlo Biffi, Steven McDonagh, Philip Torr, Aleš Leonardis, Sarah Parisot
[pdf]
[DOI]

Curriculum DeepSDF
Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, Leonidas J. Guibas
[pdf]
[DOI]

Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance
Minghua Liu, Xiaoshuai Zhang, Hao Su
[pdf]
[DOI]

Improved Adversarial Training via Learned Optimizer
Yuanhao Xiong, Cho-Jui Hsieh
[pdf]
[DOI]

Component Divide-and-Conquer for Real-World Image Super-Resolution
Pengxu Wei, Ziwei Xie, Hannan Lu, Zongyuan Zhan, Qixiang Ye, Wangmeng Zuo, Liang Lin
[pdf]
[DOI]

Enabling Deep Residual Networks for Weakly Supervised Object Detection
Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu
[pdf]
[DOI]

Deep near-light photometric stereo for spatially varying reflectances
Hiroaki Santo, Michael Waechter, Yasuyuki Matsushita
[pdf]
[DOI]

Learning Visual Representations with Caption Annotations
Mert Bulent Sariyildiz, Julien Perez, Diane Larlus
[pdf]
[DOI]

Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier
Tz-Ying Wu, Pedro Morgado, Pei Wang, Chih-Hui Ho, Nuno Vasconcelos
[pdf]
[DOI]

Regression of Instance Boundary by Aggregated CNN and GCN
Yanda Meng, Wei Meng, Dongxu Gao, Yitian Zhao, Xiaoyun Yang, Xiaowei Huang, Yalin Zheng
[pdf]
[DOI]

Social Adaptive Module for Weakly-supervised Group Activity Recognition
Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian
[pdf]
[DOI]

RGB-D Salient Object Detection with Cross-Modality Modulation and Selection
Chongyi Li, Runmin Cong, Yongri Piao, Qianqian Xu, Chen Change Loy
[pdf]
[DOI]

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
Hung-Yu Tseng, Hsin-Ying Lee, Lu Jiang, Ming-Hsuan Yang, Weilong Yang
[pdf]
[DOI]

Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection
Dongzhan Zhou, Xinchi Zhou, Hongwen Zhang, Shuai Yi, Wanli Ouyang
[pdf]
[DOI]

Faster Person Re-Identification
Guan’an Wang, Shaogang Gong, Jian Cheng, Zengguang Hou
[pdf]
[DOI]

Quantization Guided JPEG Artifact Correction
Max Ehrlich, Ser-Nam Lim, Larry Davis, Abhinav Shrivastava
[pdf]
[DOI]

3PointTM: Faster Measurement of High-Dimensional Transmission Matrices
Yujun Chen, Manoj Kumar Sharma, Ashutosh Sabharwal, Ashok Veeraraghavan, Aswin C. Sankaranarayanan
[pdf]
[DOI]

Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer
Xide Xia, Meng Zhang, Tianfan Xue, Zheng Sun, Hui Fang, Brian Kulis, Jiawen Chen
[pdf]
[DOI]

Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction
Xiangyu Zhu, Fan Yang, Di Huang, Chang Yu, Hao Wang, Jianzhu Guo, Zhen Lei, Stan Z. Li
[pdf]
[DOI]

World-Consistent Video-to-Video Synthesis
Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu
[pdf]
[DOI]

Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation
Qi Fan, Lei Ke, Wenjie Pei, Chi-Keung Tang, Yu-Wing Tai
[pdf]
[DOI]

GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild
Umberto Michieli, Edoardo Borsato, Luca Rossi, Pietro Zanuttigh
[pdf]
[DOI]

Event-based Asynchronous Sparse Convolutional Networks
Nico Messikommer, Daniel Gehrig, Antonio Loquercio, Davide Scaramuzza
[pdf]
[DOI]

AtlantaNet: Inferring the 3D Indoor Layout from a Single 360° Image beyond the Manhattan World Assumption
Giovanni Pintore, Marco Agus, Enrico Gobbetti
[pdf]
[DOI]

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang, Xuehan Xiong, Maxim Neumann, AJ Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua
[pdf]
[DOI]

REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes, Kushal Kafle, Robik Shrestha, Manoj Acharya, Christopher Kanan
[pdf]
[DOI]

Image Classification in the Dark using Quanta Image Sensors
Abhiram Gnanasambandam, Stanley H. Chan
[pdf]
[DOI]

n-Reference Transfer Learning for Saliency Prediction
Yan Luo, Yongkang Wong, Mohan S. Kankanhalli, Qi Zhao
[pdf]
[DOI]

Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection
Shuhan Chen, Yun Fu
[pdf]
[DOI]

Bottom-Up Temporal Action Localization with Mutual Regularization
Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yanfeng Wang, Qi Tian
[pdf]
[DOI]

On Modulating the Gradient for Meta-Learning
Christian Simon, Piotr Koniusz, Richard Nock, Mehrtash Harandi
[pdf]
[DOI]

Domain-Specific Mappings for Generative Adversarial Style Transfer
Hsin-Yu Chang, Zhixiang Wang, Yung-Yu Chuang
[pdf]
[DOI]

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Timo Milbich, Karsten Roth, Homanga Bharadhwaj, Samarth Sinha, Yoshua Bengio, Björn Ommer, Joseph Paul Cohen
[pdf]
[DOI]

DHP: Differentiable Meta Pruning via HyperNetworks
Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

Deep Transferring Quantization
Zheng Xie, Zhiquan Wen, Jing Liu, Zhiqiang Liu, Xixian Wu, Mingkui Tan
[pdf]
[DOI]

Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification
Guangyi Chen, Yuhao Lu, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?
Guangyi Chen, Yongming Rao, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Arbitrary-Oriented Object Detection with Circular Smooth Label
Xue Yang, Junchi Yan
[pdf]
[DOI]

Learning Event-Driven Video Deblurring and Interpolation
Songnan Lin, Jiawei Zhang, Jinshan Pan, Zhe Jiang, Dongqing Zou, Yongtian Wang, Jing Chen, Jimmy Ren
[pdf]
[DOI]

Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference
Nelson Nauata, Yasutaka Furukawa
[pdf]
[DOI]

Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation
Hang Wang, Minghao Xu, Bingbing Ni, Wenjun Zhang
[pdf]
[DOI]

CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation
Jiahua Dong, Yang Cong, Gan Sun, Yuyang Liu, Xiaowei Xu
[pdf]
[DOI]

Prototype Mixture Models for Few-shot Semantic Segmentation
Boyu Yang, Chang Liu, Bohao Li, Jianbin Jiao, Qixiang Ye
[pdf]
[DOI]

Webly Supervised Image Classification with Self-Contained Confidence
Jingkang Yang, Litong Feng, Weirong Chen, Xiaopeng Yan, Huabin Zheng, Ping Luo, Wayne Zhang
[pdf]
[DOI]

Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization
Haibao Yu, Qi Han, Jianbo Li, Jianping Shi, Guangliang Cheng, Bin Fan
[pdf]
[DOI]

Monocular 3D Object Detection via Feature Domain Adaptation
Xiaoqing Ye, Liang Du, Yifeng Shi, Yingying Li, Xiao Tan, Jianfeng Feng, Errui Ding, Shilei Wen
[pdf]
[DOI]

Talking-head Generation with Rhythmic Head Motion
Lele Chen, Guofeng Cui, Celong Liu, Zhong Li, Ziyi Kou, Yi Xu, Chenliang Xu
[pdf]
[DOI]

AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation
Xiaofeng Liu, Tong Che, Yiqun Lu, Chao Yang, Site Li, Jane You
[pdf]
[DOI]

VPN: Learning Video-Pose Embedding for Activities of Daily Living
Srijan Das, Saurav Sharma, Rui Dai, François Brémond, Monique Thonnat
[pdf]
[DOI]

Soft Anchor-Point Object Detection
Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides
[pdf]
[DOI]

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
Jun Gao, Zian Wang, Jinchen Xuan, Sanja Fidler
[pdf]
[DOI]

Soft Expert Reward Learning for Vision-and-Language Navigation
Hu Wang, Qi Wu, Chunhua Shen
[pdf]
[DOI]

Part-aware Prototype Network for Few-shot Semantic Segmentation
Yongfei Liu, Xiangyi Zhang, Songyang Zhang, Xuming He
[pdf]
[DOI]

Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization
Shujun Wang, Lequan Yu, Caizi Li, Chi-Wing Fu, Pheng-Ann Heng
[pdf]
[DOI]

Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos
Mahsa Ehsanpour, Alireza Abedin, Fatemeh Saleh, Javen Shi, Ian Reid, Hamid Rezatofighi
[pdf]
[DOI]

Whole-Body Human Pose Estimation in the Wild
Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo
[pdf]
[DOI]

Relative Pose Estimation of Calibrated Cameras with Known SE(3) Invariants
Bo Li, Evgeniy Martyushev, Gim Hee Lee
[pdf]
[DOI]

Sequential Convolution and Runge-Kutta Residual Architecture for Image Compressed Sensing
Runkai Zheng, Yinqi Zhang, Daolang Huang, Qingliang Chen
[pdf]
[DOI]

Deep Hough Transform for Semantic Line Detection
Qi Han, Kai Zhao, Jun Xu, Ming-Ming Cheng
[pdf]
[DOI]

Structured Landmark Detection via Topology-Adapting Deep Graph Learning
Weijian Li, Yuhang Lu, Kang Zheng, Haofu Liao, Chihung Lin, Jiebo Luo, Chi-Tung Cheng, Jing Xiao, Le Lu, Chang-Fu Kuo, Shun Miao
[pdf]
[DOI]

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning
Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, László A. Jeni, Fernando De la Torre
[pdf]
[DOI]

Learning to Balance Specificity and Invariance for In and Out of Domain Generalization
Prithvijit Chattopadhyay, Yogesh Balaji, Judy Hoffman
[pdf]
[DOI]

Contrastive Learning for Unpaired Image-to-Image Translation
Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu
[pdf]
[DOI]

DLow: Diversifying Latent Flows for Diverse Human Motion Prediction
Ye Yuan, Kris Kitani
[pdf]
[DOI]

GRNet: Gridding Residual Network for Dense Point Cloud Completion
Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun
[pdf]
[DOI]

Gait Lateral Network: Learning Discriminative and Compact Representations for Gait Recognition
Saihui Hou, Chunshui Cao, Xu Liu, Yongzhen Huang
[pdf]
[DOI]

Blind Face Restoration via Deep Multi-scale Component Dictionaries
Xiaoming Li, Chaofeng Chen, Shangchen Zhou, Xianhui Lin, Wangmeng Zuo, Lei Zhang
[pdf]
[DOI]

Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods
Byungjoo Kim, Bryce Chudomelka, Jinyoung Park, Jaewoo Kang, Youngjoon Hong, Hyunwoo J. Kim
[pdf]
[DOI]

Inequality-Constrained and Robust 3D Face Model Fitting
Evangelos Sariyanidi, Casey J. Zampella, Robert T. Schultz, Birkan Tunc
[pdf]
[DOI]

Gabor Layers Enhance Network Robustness
Juan C. Pérez, Motasem Alfarra, Guillaume Jeanneret, Adel Bibi, Ali Thabet, Bernard Ghanem, Pablo Arbeláez
[pdf]
[DOI]

Conditional Image Repainting via Semantic Bridge and Piecewise Value Function
Shuchen Weng, Wenbo Li, Dawei Li, Hongxia Jin, Boxin Shi
[pdf]
[DOI]

Learnable Cost Volume Using the Cayley Representation
Taihong Xiao, Jinwei Yuan, Deqing Sun, Qifei Wang, Xin-Yu Zhang, Kehan Xu, Ming-Hsuan Yang
[pdf]
[DOI]

HALO: Hardware-Aware Learning to Optimize
Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin
[pdf]
[DOI]

Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, Zihan Zhou
[pdf]
[DOI]

BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition
Yonghyun Kim, Wonpyo Park, Jongju Shin
[pdf]
[DOI]

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision
Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian
[pdf]
[DOI]

Domain Adaptive Semantic Segmentation Using Weak Labels
Sujoy Paul, Yi-Hsuan Tsai, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker
[pdf]
[DOI]

Knowledge Distillation Meets Self-Supervision
Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy
[pdf]
[DOI]

Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions
Ignacio Rocco, Relja Arandjelović, Josef Sivic
[pdf]
[DOI]

Reconstructing the Noise Variance Manifold for Image Denoising
Ioannis Marras, Grigorios G. Chrysos, Ioannis Alexiou, Gregory Slabaugh, Stefanos Zafeiriou
[pdf]
[DOI]

Occlusion-Aware Depth Estimation with Adaptive Normal Constraints
Xiaoxiao Long, Lingjie Liu, Christian Theobalt, Wenping Wang
[pdf]
[DOI]

VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao, Changan Chen, Ziad Al-Halah, Carl Schissler, Kristen Grauman
[pdf]
[DOI]

Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval
Andrew Brown, Weidi Xie, Vicky Kalogeiton, Andrew Zisserman
[pdf]
[DOI]

Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation
Liang-Chieh Chen, Raphael Gontijo Lopes, Bowen Cheng, Maxwell D. Collins, Ekin D. Cubuk, Barret Zoph, Hartwig Adam, Jonathon Shlens
[pdf]
[DOI]

Spatially Aware Multimodal Transformers for TextVQA
Yash Kant, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, Harsh Agrawal
[pdf]
[DOI]

Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector
Cheng-Chun Hsu, Yi-Hsuan Tsai, Yen-Yu Lin, Ming-Hsuan Yang
[pdf]
[DOI]

URIE: Universal Image Enhancement for Visual Recognition in the Wild
Taeyoung Son, Juwon Kang, Namyup Kim, Sunghyun Cho, Suha Kwak
[pdf]
[DOI]

Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
[pdf]
[DOI]

SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning
Junbing Li, Changqing Zhang, Pengfei Zhu, Baoyuan Wu, Lei Chen, Qinghua Hu
[pdf]
[DOI]

Unpaired Image-to-Image Translation using Adversarial Consistency Loss
Yihao Zhao, Ruihai Wu, Hao Dong
[pdf]
[DOI]

Discriminability Distillation in Group Representation Learning
Manyuan Zhang, Guanglu Song, Hang Zhou, Yu Liu
[pdf]
[DOI]

Monocular Expressive Body Regression through Body-Driven Attention
Vasileios Choutas, Georgios Pavlakos, Timo Bolkart, Dimitrios Tzionas, Michael J. Black
[pdf]
[DOI]

Dual Adversarial Network: Toward Real-world Noise Removal and Noise Generation
Zongsheng Yue, Qian Zhao, Lei Zhang, Deyu Meng
[pdf]
[DOI]

Linguistic Structure Guided Context Modeling for Referring Image Segmentation
Tianrui Hui, Si Liu, Shaofei Huang, Guanbin Li, Sansi Yu, Faxi Zhang, Jizhong Han
[pdf]
[DOI]

Federated Visual Classification with Real-World Data Distribution
Tzu-Ming Harry Hsu, Hang Qi, Matthew Brown
[pdf]
[DOI]

Robust Re-Identification by Multiple Views Knowledge Distillation
Angelo Porrello, Luca Bergamini, Simone Calderara
[pdf]
[DOI]

Defocus Deblurring Using Dual-Pixel Data
Abdullah Abuolaim, Michael S. Brown
[pdf]
[DOI]

RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos
Tianshu Yu, Yikang Li, Baoxin Li
[pdf]
[DOI]

Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping
Uttaran Bhattacharya, Christian Roncal, Trisha Mittal, Rohan Chandra, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha
[pdf]
[DOI]

Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
Liang Liu, Hao Lu, Hongwei Zou, Haipeng Xiong, Zhiguo Cao, Chunhua Shen
[pdf]
[DOI]

Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
Yunfei Liu, Xingjun Ma, James Bailey, Feng Lu
[pdf]
[DOI]

Learning to Learn with Variational Information Bottleneck for Domain Generalization
Yingjun Du, Jun Xu, Huan Xiong, Qiang Qiu, Xiantong Zhen, Cees G. M. Snoek, Ling Shao
[pdf]
[DOI]

Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis
Ruixuan Yu, Xin Wei, Federico Tombari, Jian Sun
[pdf]
[DOI]

Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks
Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser
[pdf]
[DOI]

Layered Neighborhood Expansion for Incremental Multiple Graph Matching
Zixuan Chen, Zhihui Xie, Junchi Yan, Yinqiang Zheng, Xiaokang Yang
[pdf]
[DOI]

SCAN: Learning to Classify Images without Labels
Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans, Luc Van Gool
[pdf]
[DOI]

Graph convolutional networks for learning with few clean and many noisy labels
Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondřej Chum, Cordelia Schmid
[pdf]
[DOI]

Object-and-Action Aware Model for Visual Language Navigation
Yuankai Qi, Zizheng Pan, Shengping Zhang, Anton van den Hengel, Qi Wu
[pdf]
[DOI]

A Comprehensive Study of Weight Sharing in Graph Networks for 3D Human Pose Estimation
Kenkun Liu, Rongqi Ding, Zhiming Zou, Le Wang, Wei Tang
[pdf]
[DOI]

MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution
Wenbo Li, Xin Tao, Taian Guo, Lu Qi, Jiangbo Lu, Jiaya Jia
[pdf]
[DOI]

Efficient Semantic Video Segmentation with Per-frame Inference
Yifan Liu, Chunhua Shen, Changqian Yu, Jingdong Wang
[pdf]
[DOI]

Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers
Christoph Kamann, Carsten Rother
[pdf]
[DOI]

Deep Spiking Neural Network: Energy Efficiency Through Time based Coding
Bing Han, Kaushik Roy
[pdf]
[DOI]

InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling
Jun Wang, Shiyi Lan, Mingfei Gao, Larry S. Davis
[pdf]
[DOI]

Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty Detection
Poojan Oza, Vishal M. Patel
[pdf]
[DOI]

People as Scene Probes
Yifan Wang, Brian L. Curless, Steven M. Seitz
[pdf]
[DOI]

Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud Shapes
Lei Yang, Wenxi Liu, Zhiming Cui, Nenglun Chen, Wenping Wang
[pdf]
[DOI]

Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions
Matheus Gadelha, Aruni RoyChowdhury, Gopal Sharma, Evangelos Kalogerakis, Liangliang Cao, Erik Learned-Miller, Rui Wang, Subhransu Maji
[pdf]
[DOI]

TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video
Tiancheng Zhi, Christoph Lassner, Tony Tung, Carsten Stoll, Srinivasa G. Narasimhan, Minh Vo
[pdf]
[DOI]

Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost
Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan Ö. Arık, Larry S. Davis, Tomas Pfister
[pdf]
[DOI]

Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation
Fangyun Wei, Xiao Sun, Hongyang Li, Jingdong Wang, Stephen Lin
[pdf]
[DOI]

Modeling 3D Shapes by Reinforcement Learning
Cheng Lin, Tingxiang Fan, Wenping Wang, Matthias Nießner
[pdf]
[DOI]

LST-Net: Learning a Convolutional Neural Network with a Learnable Sparse Transform
Lida Li, Kun Wang, Shuai Li, Xiangchu Feng, Lei Zhang
[pdf]
[DOI]

Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
Damien Teney, Ehsan Abbasnedjad, Anton van den Hengel
[pdf]
[DOI]

CN: Channel Normalization For Point Cloud Recognition
Zetong Yang, Yanan Sun, Shu Liu, Xiaojuan Qi, Jiaya Jia
[pdf]
[DOI]

Rethinking the Defocus Blur Detection Problem and A Real-Time Deep DBD Model
Ning Zhang, Junchi Yan
[pdf]
[DOI]

AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter Learning
Jianchao Zhu, Liangliang Shi, Junchi Yan, Hongyuan Zha
[pdf]
[DOI]

Scene Text Image Super-resolution in the wild
Wenjia Wang, Enze Xie, Xuebo Liu, Wenhai Wang, Ding Liang, Chunhua Shen, Xiang Bai
[pdf]
[DOI]

Coupling Explicit and Implicit Surface Representations for Generative 3D Modeling
Omid Poursaeed, Matthew Fisher, Noam Aigerman, Vladimir G. Kim
[pdf]
[DOI]

Learning Disentangled Representations with Latent Variation Predictability
Xinqi Zhu, Chang Xu, Dacheng Tao
[pdf]
[DOI]

Deep Space-Time Video Upsampling Networks
Jaeyeon Kang, Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim
[pdf]
[DOI]

Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery
Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang
[pdf]
[DOI]

Fast Video Object Segmentation using the Global Context Module
Yu Li, Zhuoran Shen, Ying Shan
[pdf]
[DOI]

Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos
Anurag Arnab, Chen Sun, Arsha Nagrani, Cordelia Schmid
[pdf]
[DOI]

Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification
Nikita Dvornik, Cordelia Schmid, Julien Mairal
[pdf]
[DOI]

MessyTable: Instance Association in Multiple Camera Views
Zhongang Cai, Junzhe Zhang, Daxuan Ren, Cunjun Yu, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Chen Change Loy
[pdf]
[DOI]

A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin
[pdf]
[DOI]

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman
[pdf]
[DOI]

HTML: A Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization
Neng Qian, Jiayi Wang, Franziska Mueller, Florian Bernard, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

CycAs: Self-supervised Cycle Association for Learning Re-identifiable Descriptions
Zhongdao Wang, Jingwei Zhang, Liang Zheng, Yixuan Liu, Yifan Sun, Yali Li, Shengjin Wang
[pdf]
[DOI]

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions
Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li
[pdf]
[DOI]

Towards Real-Time Multi-Object Tracking
Zhongdao Wang, Liang Zheng, Yixuan Liu, Yali Li, Shengjin Wang
[pdf]
[DOI]

A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation
Jian Liang, Yunbo Wang, Dapeng Hu, Ran He, Jiashi Feng
[pdf]
[DOI]

Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss
Yang Li, Shichao Kan, Zhihai He
[pdf]
[DOI]

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
Ali Athar, Sabarinath Mahadevan, Aljosa Osep, Laura Leal-Taixé, Bastian Leibe
[pdf]
[DOI]

Hierarchical Style-based Networks for Motion Synthesis
Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell
[pdf]
[DOI]

Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
Benjamin Biggs, Oliver Boyne, James Charles, Andrew Fitzgibbon, Roberto Cipolla
[pdf]
[DOI]

Learning to Count in the Crowd from Limited Labeled Data
Vishwanath A. Sindagi, Rajeev Yasarla, Deepak Sam Babu, R. Venkatesh Babu, Vishal M. Patel
[pdf]
[DOI]

SPOT: Selective Point Cloud Voting for Better Proposal in Point Cloud Object Detection
Hongyuan Du, Linjun Li, Bo Liu, Nuno Vasconcelos
[pdf]
[DOI]

Explainable Face Recognition
Jonathan R. Williford, Brandon B. May, Jeffrey Byrne
[pdf]
[DOI]

From Shadow Segmentation to Shadow Removal
Hieu Le, Dimitris Samaras
[pdf]
[DOI]

Diverse and Admissible Trajectory Prediction through Multimodal Context Understanding
Seong Hyeon Park, Gyubok Lee, Jimin Seo, Manoj Bhat, Minseok Kang, Jonathan Francis, Ashwin Jadhav, Paul Pu Liang, Louis-Philippe Morency
[pdf]
[DOI]

CONFIG: Controllable Neural Face Image Generation
Marek Kowalski, Stephan J. Garbin, Virginia Estellers, Tadas Baltrušaitis, Matthew Johnson, Jamie Shotton
[pdf]
[DOI]

Single View Metrology in the Wild
Rui Zhu, Xingyi Yang, Yannick Hold-Geoffroy, Federico Perazzi, Jonathan Eisenmann, Kalyan Sunkavalli, Manmohan Chandraker
[pdf]
[DOI]

Procedure Planning in Instructional Videos
Chien-Yi Chang, De-An Huang, Danfei Xu, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
[pdf]
[DOI]

Funnel Activation for Visual Recognition
Ningning Ma, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

GIQA: Generated Image Quality Assessment
Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
[pdf]
[DOI]

Adversarial Continual Learning
Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach
[pdf]
[DOI]

Adapting Object Detectors with Conditional Domain Normalization
Peng Su, Kun Wang, Xingyu Zeng, Shixiang Tang, Dapeng Chen, Di Qiu, Xiaogang Wang
[pdf]
[DOI]

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction
Tianjiao Li, Jun Liu, Wei Zhang, Lingyu Duan
[pdf]
[DOI]

Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction
Lokender Tiwari, Pan Ji, Quoc-Huy Tran, Bingbing Zhuang, Saket Anand, Manmohan Chandraker
[pdf]
[DOI]

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting
Shengcai Liao, Ling Shao
[pdf]
[DOI]

Self-supervised Bayesian Deep Learning for Image Recovery with Applications to Compressive Sensing
Tongyao Pang, Yuhui Quan, Hui Ji
[pdf]
[DOI]

Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement
Jian Wang, Xiang Long, Yuan Gao, Errui Ding, Shilei Wen
[pdf]
[DOI]

Semi-supervised Learning with a Teacher-student Network for Generalized Attribute Prediction
Minchul Shin
[pdf]
[DOI]

Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identification
Fang Zhao, Shengcai Liao, Guo-Sen Xie, Jian Zhao, Kaihao Zhang, Ling Shao
[pdf]
[DOI]

DPDist: Comparing Point Clouds Using Deep Point Cloud Distance
Dahlia Urbach, Yizhak Ben-Shabat, Michael Lindenbaum
[pdf]
[DOI]

Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation
Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng
[pdf]
[DOI]

DataMix: Efficient Privacy-Preserving Edge-Cloud Inference
Zhijian Liu, Zhanghao Wu, Chuang Gan, Ligeng Zhu, Song Han
[pdf]
[DOI]

Neural Re-Rendering of Humans from a Single Image
Kripasindhu Sarkar, Dushyant Mehta, Weipeng Xu, Vladislav Golyanik, Christian Theobalt
[pdf]
[DOI]

Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation
Filippo Aleotti, Fabio Tosi, Li Zhang, Matteo Poggi, Stefano Mattoccia
[pdf]
[DOI]

PIPAL: a Large-Scale Image Quality Assessment Dataset for Perceptual Image Restoration
Jinjin Gu, Haoming Cai, Haoyu Chen, Xiaoxing Ye, Jimmy S. Ren, Chao Dong
[pdf]
[DOI]

Why do These Match? Explaining the Behavior of Image Similarity Models
Bryan A. Plummer, Mariya I. Vasileva, Vitali Petsiuk, Kate Saenko, David Forsyth
[pdf]
[DOI]

CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute Editing
Xuanhong Chen, Bingbing Ni, Naiyuan Liu, Ziang Liu, Yiliu Jiang, Loc Truong, Qi Tian
[pdf]
[DOI]

Progressive Transformers for End-to-End Sign Language Production
Ben Saunders, Necati Cihan Camgoz, Richard Bowden
[pdf]
[DOI]

Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting
Minghui Liao, Guan Pang, Jing Huang, Tal Hassner, Xiang Bai
[pdf]
[DOI]

Making Affine Correspondences Work in Camera Geometry Computation
Daniel Barath, Michal Polic, Wolfgang Förstner, Torsten Sattler, Tomas Pajdla, Zuzana Kukelova
[pdf]
[DOI]

Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces
Jiankang Deng, Jia Guo, Tongliang Liu, Mingming Gong, Stefanos Zafeiriou
[pdf]
[DOI]

Foley Music: Learning to Generate Music from Videos
Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba
[pdf]
[DOI]

Contrastive Multiview Coding
Yonglong Tian, Dilip Krishnan, Phillip Isola
[pdf]
[DOI]

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses
Yingwei Li, Song Bai, Cihang Xie, Zhenyu Liao, Xiaohui Shen, Alan Yuille
[pdf]
[DOI]

Generative Low-bitwidth Data Free Quantization
Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, Jiezhang Cao, Chuangrun Liang, Mingkui Tan
[pdf]
[DOI]

Local Correlation Consistency for Knowledge Distillation
Xiaojie Li, Jianlong Wu, Hongyu Fang, Yue Liao, Fei Wang, Chen Qian
[pdf]
[DOI]

Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild
Jason Y. Zhang, Sam Pepose, Hanbyul Joo, Deva Ramanan, Jitendra Malik, Angjoo Kanazawa
[pdf]
[DOI]

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
Hang Zhou, Xudong Xu, Dahua Lin, Xiaogang Wang, Ziwei Liu
[pdf]
[DOI]

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
Yuanhan Zhang, ZhenFei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu
[pdf]
[DOI]

Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues
Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, Jing Shao
[pdf]
[DOI]

Weakly-Supervised Cell Tracking via Backward-and-Forward Propagation
Kazuya Nishimura, Junya Hayashida, Chenyang Wang, Dai Fei Elmer Ker, Ryoma Bise
[pdf]
[DOI]

SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation
John Yang, Hyung Jin Chang, Seungeui Lee, Nojun Kwak
[pdf]
[DOI]

Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization
Zijie Zhuang, Longhui Wei, Lingxi Xie, Tianyu Zhang, Hengheng Zhang, Haozhe Wu, Haizhou Ai, Qi Tian
[pdf]
[DOI]

AMLN: Adversarial-based Mutual Learning Network for Online Knowledge Distillation
Xiaobing Zhang, Shijian Lu, Haigang Gong, Zhipeng Luo, Ming Liu
[pdf]
[DOI]

Online Multi-modal Person Search in Videos
Jiangyue Xia, Anyi Rao, Qingqiu Huang, Linning Xu, Jiangtao Wen, Dahua Lin
[pdf]
[DOI]

Single Image Super-Resolution via a Holistic Attention Network
Ben Niu, Weilei Wen, Wenqi Ren, Xiangde Zhang, Lianping Yang, Shuzhen Wang, Kaihao Zhang, Xiaochun Cao, Haifeng Shen
[pdf]
[DOI]

Can You Read Me Now? Content Aware Rectification using Angle Supervision
Amir Markovitz, Inbal Lavi, Or Perel, Shai Mazor, Roee Litman
[pdf]
[DOI]

Momentum Batch Normalization for Deep Learning with Small Batch Size
Hongwei Yong, Jianqiang Huang, Deyu Meng, Xiansheng Hua, Lei Zhang
[pdf]
[DOI]

AdvPC: Transferable Adversarial Perturbations on 3D Point Clouds
Abdullah Hamdi, Sara Rojas, Ali Thabet, Bernard Ghanem
[pdf]
[DOI]

Edge-aware Graph Representation Learning and Reasoning for Face Parsing
Gusi Te, Yinglu Liu, Wei Hu, Hailin Shi, Tao Mei
[pdf]
[DOI]

BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network
Deng-Ping Fan, Yingjie Zhai, Ali Borji, Jufeng Yang, Ling Shao
[pdf]
[DOI]

G-LBM: Generative Low-dimensional Background Model Estimation from Video Sequences
Behnaz Rezaei, Amirreza Farnoosh, Sarah Ostadabbas
[pdf]
[DOI]

H3DNet: 3D Object Detection Using Hybrid Geometric Primitives
Zaiwei Zhang, Bo Sun, Haitao Yang, Qixing Huang
[pdf]
[DOI]

Expressive Telepresence via Modular Codec Avatars
Hang Chu, Shugao Ma, Fernando De la Torre, Sanja Fidler, Yaser Sheikh
[pdf]
[DOI]

Cascade Graph Neural Networks for RGB-D Salient Object Detection
Ao Luo, Xin Li, Fan Yang, Zhicheng Jiao, Hong Cheng, Siwei Lyu
[pdf]
[DOI]

FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret
Vishnu Suresh Lokhande, Aditya Kumar Akash, Sathya N. Ravi, Vikas Singh
[pdf]
[DOI]

Generating Videos of Zero-Shot Compositions of Actions and Objects
Megha Nawhal, Mengyao Zhai, Andreas Lehrmann, Leonid Sigal, Greg Mori
[pdf]
[DOI]

ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language
Zhe Wang, Zhiyuan Fang, Jun Wang, Yezhou Yang
[pdf]
[DOI]

Renovating Parsing R-CNN for Accurate Multiple Human Parsing
Lu Yang, Qing Song, Zhihui Wang, Mengjie Hu, Chun Liu, Xueshi Xin, Wenhe Jia, Songcen Xu
[pdf]
[DOI]

Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning
Qing Yu, Daiki Ikami, Go Irie, Kiyoharu Aizawa
[pdf]
[DOI]

Gradient-Induced Co-Saliency Detection
Zhao Zhang, Wenda Jin, Jun Xu, Ming-Ming Cheng
[pdf]
[DOI]

Nighttime Defogging Using High-Low Frequency Decomposition and Grayscale-Color Networks
Wending Yan, Robby T. Tan, Dengxin Dai
[pdf]
[DOI]

SegFix: Model-Agnostic Boundary Refinement for Segmentation
Yuhui Yuan, Jingyi Xie, Xilin Chen, Jingdong Wang
[pdf]
[DOI]

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi
[pdf]
[DOI]

Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars
Egor Zakharov, Aleksei Ivakhnenko, Aliaksandra Shysheya, Victor Lempitsky
[pdf]
[DOI]

Neural Geometric Parser for Single Image Camera Calibration
Jinwoo Lee, Minhyuk Sung, Hyunjoon Lee, Junho Kim
[pdf]
[DOI]

Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision
Yuxiang Wei, Ming Liu, Haolin Wang, Ruifeng Zhu, Guosheng Hu, Wangmeng Zuo
[pdf]
[DOI]

Learning Architectures for Binary Networks
Dahyun Kim, Kunal Pratap Singh, Jonghyun Choi
[pdf]
[DOI]

Semantic View Synthesis
Hsin-Ping Huang, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang
[pdf]
[DOI]

An Analysis of Sketched IRLS for Accelerated Sparse Residual Regression
Daichi Iwata, Michael Waechter, Wen-Yan Lin, Yasuyuki Matsushita
[pdf]
[DOI]

Relative Pose from Deep Learned Depth and a Single Affine Correspondence
Ivan Eichhardt, Daniel Barath
[pdf]
[DOI]

Video Super-Resolution with Recurrent Structure-Detail Network
Takashi Isobe, Xu Jia, Shuhang Gu, Songjiang Li, Shengjin Wang, Qi Tian
[pdf]
[DOI]

Shape Adaptor: A Learnable Resizing Module
Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns
[pdf]
[DOI]

Shuffle and Attend: Video Domain Adaptation
Jinwoo Choi, Gaurav Sharma, Samuel Schulter, Jia-Bin Huang
[pdf]
[DOI]

DRG: Dual Relation Graph for Human-Object Interaction Detection
Chen Gao, Jiarui Xu, Yuliang Zou, Jia-Bin Huang
[pdf]
[DOI]

Flow-edge Guided Video Completion
Chen Gao, Ayush Saraf, Jia-Bin Huang, Johannes Kopf
[pdf]
[DOI]

End-to-End Trainable Deep Active Contour Models for Automated Image Segmentation: Delineating Buildings in Aerial Imagery
Ali Hatamizadeh, Debleena Sengupta, Demetri Terzopoulos
[pdf]
[DOI]

Towards End-to-end Video-based Eye-Tracking
Seonwook Park, Emre Aksan, Xucong Zhang, Otmar Hilliges
[pdf]
[DOI]

Generating Handwriting via Decoupled Style Descriptors
Atsunobu Kotani, Stefanie Tellex, James Tompkin
[pdf]
[DOI]

LEED: Label-Free Expression Editing via Disentanglement
Rongliang Wu, Shijian Lu
[pdf]
[DOI]

Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards
Xuewen Yang, Heming Zhang, Di Jin, Yingru Liu, Chi-Hao Wu, Jianchao Tan, Dongliang Xie, Jue Wang, Xin Wang
[pdf]
[DOI]

Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
Gouthaman KV, Anurag Mittal
[pdf]
[DOI]

Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation
Jogendra Nath Kundu, Ambareesh Revanur, Govind Vitthal Waghmare, Rahul Mysore Venkatesh, R. Venkatesh Babu
[pdf]
[DOI]

Class-Incremental Domain Adaptation
Jogendra Nath Kundu, Rahul Mysore Venkatesh, Naveen Venkat, Ambareesh Revanur, R. Venkatesh Babu
[pdf]
[DOI]

Anti-Bandit Neural Architecture Search for Model Defense
Hanlin Chen, Baochang Zhang, Song Xue, Xuan Gong, Hong Liu, Rongrong Ji, David Doermann
[pdf]
[DOI]

Wavelet-Based Dual-Branch Network for Image Demoiréing
Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian
[pdf]
[DOI]

Low Light Video Enhancement using Synthetic Data Produced with an Intermediate Domain Mapping
Danai Triantafyllidou, Sean Moran, Steven McDonagh, Sarah Parisot, Gregory Slabaugh
[pdf]
[DOI]

Non-Local Spatial Propagation Network for Depth Completion
Jinsun Park, Kyungdon Joo, Zhe Hu, Chi-Kuei Liu, In So Kweon
[pdf]
[DOI]

DanbooRegion: An Illustration Region Dataset
Lvmin Zhang, Yi Ji, Chunping Liu
[pdf]
[DOI]

Event Enhanced High-Quality Image Recovery
Bishan Wang, Jingwei He, Lei Yu, Gui-Song Xia, Wen Yang
[pdf]
[DOI]

PackDet: Packed Long-Head Object Detector
Kun Ding, Guojin He, Huxiang Gu, Zisha Zhong, Shiming Xiang, Chunhong Pan
[pdf]
[DOI]

A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NAS
Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, Huazhong Yang
[pdf]
[DOI]

Learning Semantic Neural Tree for Human Parsing
Ruyi Ji, Dawei Du, Libo Zhang, Longyin Wen, Yanjun Wu, Chen Zhao, Feiyue Huang, Siwei Lyu
[pdf]
[DOI]

Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation
Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

Burst Denoising via Temporally Shifted Wavelet Transforms
Xuejian Rong, Denis Demandolx, Kevin Matzen, Priyam Chatterjee, Yingli Tian
[pdf]
[DOI]

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans
Fengze Liu, Jinzheng Cai, Yuankai Huo, Chi-Tung Cheng, Ashwin Raju, Dakai Jin, Jing Xiao, Alan Yuille, Le Lu, ChienHung Liao, Adam P. Harrison
[pdf]
[DOI]

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction
Junwei Liang, Lu Jiang, Alexander Hauptmann
[pdf]
[DOI]

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation
Bowen Chen, Huan Ling, Xiaohui Zeng, Jun Gao, Ziyue Xu, Sanja Fidler
[pdf]
[DOI]

Rethinking Pseudo-LiDAR Representation
Xinzhu Ma, Shinan Liu, Zhiyi Xia, Hongwen Zhang, Xingyu Zeng, Wanli Ouyang
[pdf]
[DOI]

Deep Multi Depth Panoramas for View Synthesis
Kai-En Lin, Zexiang Xu, Ben Mildenhall, Pratul P. Srinivasan, Yannick Hold-Geoffroy, Stephen DiVerdi, Qi Sun, Kalyan Sunkavalli, Ravi Ramamoorthi
[pdf]
[DOI]

MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong, Xuanteng Huang, Wei-Hong Li, Wei-Shi Zheng
[pdf]
[DOI]

ContactPose: A Dataset of Grasps with Object Contact and Hand Pose
Samarth Brahmbhatt, Chengcheng Tang, Christopher D. Twigg, Charles C. Kemp, James Hays
[pdf]
[DOI]

API-Net: Robust Generative Classifier via a Single Discriminator
Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian
[pdf]
[DOI]

Bias-based Universal Adversarial Patch Attack for Automatic Check-out
Aishan Liu, Jiakai Wang, Xianglong Liu, Bowen Cao, Chongzhi Zhang, Hang Yu
[pdf]
[DOI]

Imbalanced Continual Learning with Partitioning Reservoir Sampling
Chris Dongjoo Kim, Jinseo Jeong, Gunhee Kim
[pdf]
[DOI]

Guided Collaborative Training for Pixel-wise Semi-Supervised Learning
Zhanghan Ke, Di Qiu, Kaican Li, Qiong Yan, Rynson W.H. Lau
[pdf]
[DOI]

Stacking Networks Dynamically for Image Restoration Based on the Plug-and-Play Framework
Haixin Wang, Tianhao Zhang, Muzhi Yu, Jinan Sun, Wei Ye, Chen Wang, Shikun Zhang
[pdf]
[DOI]

Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight
Ming Sun, Haoxuan Dou, Junjie Yan
[pdf]
[DOI]

Spatial Attention Pyramid Network for Unsupervised Domain Adaptation
Congcong Li, Dawei Du, Libo Zhang, Longyin Wen, Tiejian Luo, Yanjun Wu, Pengfei Zhu
[pdf]
[DOI]

GSIR: Generalizable 3D Shape Interpretation and Reconstruction
Jianren Wang, Zhaoyuan Fang
[pdf]
[DOI]

Weakly Supervised 3D Object Detection from Lidar Point Cloud
Qinghao Meng, Wenguan Wang, Tianfei Zhou, Jianbing Shen, Luc Van Gool, Dengxin Dai
[pdf]
[DOI]

Two-phase Pseudo Label Densification for Self-training based Domain Adaptation
Inkyu Shin, Sanghyun Woo, Fei Pan, In So Kweon
[pdf]
[DOI]

Adaptive Offline Quintuplet Loss for Image-Text Matching
Tianlang Chen, Jiajun Deng, Jiebo Luo
[pdf]
[DOI]

Learning Object Placement by Inpainting for Compositional Data Augmentation
Lingzhi Zhang, Tarmily Wen, Jie Min, Jiancong Wang, David Han, Jianbo Shi
[pdf]
[DOI]

Deep Vectorization of Technical Drawings
Vage Egiazarian, Oleg Voynov, Alexey Artemov, Denis Volkhonskiy, Aleksandr Safin, Maria Taktasheva, Denis Zorin, Evgeny Burnaev
[pdf]
[DOI]

CAD-Deform: Deformable Fitting of CAD Models to 3D Scans
Vladislav Ishimtsev, Alexey Bokhovkin, Alexey Artemov, Savva Ignatyev, Matthias Niessner, Denis Zorin, Evgeny Burnaev
[pdf]
[DOI]

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Wujie Wen, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang
[pdf]
[DOI]

AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points
Yuexin Ma, Xinge Zhu, Xinjing Cheng, Ruigang Yang, Jiming Liu, Dinesh Manocha
[pdf]
[DOI]

Multi-Agent Embodied Question Answering in Interactive Environments
Sinan Tan, Weilai Xiang, Huaping Liu, Di Guo, Fuchun Sun
[pdf]
[DOI]

Conditional Sequential Modulation for Efficient Global Image Retouching
Jingwen He, Yihao Liu, Yu Qiao, Chao Dong
[pdf]
[DOI]

Segmenting Transparent Objects in the Wild
Enze Xie, Wenjia Wang, Wenhai Wang, Mingyu Ding, Chunhua Shen, Ping Luo
[pdf]
[DOI]

Length-Controllable Image Captioning
Chaorui Deng, Ning Ding, Mingkui Tan, Qi Wu
[pdf]
[DOI]

Few-Shot Semantic Segmentation with Democratic Attention Networks
Haochen Wang, Xudong Zhang, Yutao Hu, Yandan Yang, Xianbin Cao, Xiantong Zhen
[pdf]
[DOI]

Defocus Blur Detection via Depth Distillation
Xiaodong Cun, Chi-Man Pun
[pdf]
[DOI]

Motion Guided 3D Pose Estimation from Videos
Jingbo Wang, Sijie Yan, Yuanjun Xiong, Dahua Lin
[pdf]
[DOI]

Reflection Separation via Multi-bounce Polarization State Tracing
Rui Li, Simeng Qiu, Guangming Zang, Wolfgang Heidrich
[pdf]
[DOI]

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation
Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
[pdf]
[DOI]

SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing
Haonan Qiu, Chaowei Xiao, Lei Yang, Xinchen Yan, Honglak Lee, Bo Li
[pdf]
[DOI]

Learning with Noisy Class Labels for Instance Segmentation
Longrong Yang, Fanman Meng, Hongliang Li, Qingbo Wu, Qishang Cheng
[pdf]
[DOI]

Deep Image Clustering with Category-Style Representation
Junjie Zhao, Donghuan Lu, Kai Ma, Yu Zhang, Yefeng Zheng
[pdf]
[DOI]

Self-supervised Motion Representation via Scattering Local Motion Cues
Yuan Tian, Zhaohui Che, Wenbo Bao, Guangtao Zhai, Zhiyong Gao
[pdf]
[DOI]

Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets
Tian Chen, Shijie An, Yuan Zhang, Chongyang Ma, Huayan Wang, Xiaoyan Guo, Wen Zheng
[pdf]
[DOI]

BMBC: Bilateral Motion Estimation with Bilateral Cost Volume for Video Interpolation
Junheum Park, Keunsoo Ko, Chul Lee, Chang-Su Kim
[pdf]
[DOI]

Hard negative examples are hard, but useful
Hong Xuan, Abby Stylianou, Xiaotong Liu, Robert Pless
[pdf]
[DOI]

ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions
Zechun Liu, Zhiqiang Shen, Marios Savvides, Kwang-Ting Cheng
[pdf]
[DOI]

Video Object Detection via Object-level Temporal Aggregation
Chun-Han Yao, Chen Fang, Xiaohui Shen, Yangyue Wan, Ming-Hsuan Yang
[pdf]
[DOI]

Object Detection with a Unified Label Space from Multiple Datasets
Xiangyun Zhao, Samuel Schulter, Gaurav Sharma, Yi-Hsuan Tsai, Manmohan Chandraker, Ying Wu
[pdf]
[DOI]

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
Jonah Philion, Sanja Fidler
[pdf]
[DOI]

Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong, Liwei Wang, Jianshu Chen, Dong Yu, Yin Li
[pdf]
[DOI]

Symbiotic Adversarial Learning for Attribute-based Person Search
Yu-Tong Cao, Jingya Wang, Dacheng Tao
[pdf]
[DOI]

Amplifying Key Cues for Human-Object-Interaction Detection
Yang Liu, Qingchao Chen, Andrew Zisserman
[pdf]
[DOI]

Rethinking Few-shot Image Classification: A Good Embedding is All You Need?
Yonglong Tian, Yue Wang, Dilip Krishnan, Joshua B. Tenenbaum, Phillip Isola
[pdf]
[DOI]

Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization
Kyle Min, Jason J. Corso
[pdf]
[DOI]

Action Localization through Continual Predictive Learning
Sathyanarayanan Aakur, Sudeep Sarkar
[pdf]
[DOI]

Generative View-Correlation Adaptation for Semi-Supervised Multi-View Learning
Yunyu Liu, Lichen Wang, Yue Bai, Can Qin, Zhengming Ding, Yun Fu
[pdf]
[DOI]

READ: Reciprocal Attention Discriminator for Image-to-Video Re-Identification
Minho Shim, Hsuan-I Ho, Jinhyung Kim, Dongyoon Wee
[pdf]
[DOI]

3D Human Shape Reconstruction from a Polarization Image
Shihao Zou, Xinxin Zuo, Yiming Qian, Sen Wang, Chi Xu, Minglun Gong, Li Cheng
[pdf]
[DOI]

The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification
Pirazh Khorramshahi, Neehar Peri, Jun-cheng Chen, Rama Chellappa
[pdf]
[DOI]

Improving One-stage Visual Grounding by Recursive Sub-query Construction
Zhengyuan Yang, Tianlang Chen, Liwei Wang, Jiebo Luo
[pdf]
[DOI]

Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video
Jianyi Wang, Xin Deng, Mai Xu, Congyong Chen, Yuhang Song
[pdf]
[DOI]

Example-Guided Image Synthesis using Masked Spatial-Channel Attention and Self-Supervision
Haitian Zheng, Haofu Liao, Lele Chen, Wei Xiong, Tianlang Chen, Jiebo Luo
[pdf]
[DOI]

Content-Consistent Matching for Domain Adaptive Semantic Segmentation
Guangrui Li, Guoliang Kang, Wu Liu, Yunchao Wei, Yi Yang
[pdf]
[DOI]

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, ZhiBo Yang, Tong Lu, Chunhua Shen, Ping Luo
[pdf]
[DOI]

History Repeats Itself: Human Motion Prediction via Motion Attention
Wei Mao, Miaomiao Liu, Mathieu Salzmann
[pdf]
[DOI]

Unsupervised Video Object Segmentation with Joint Hotspot Tracking
Lu Zhang, Jianming Zhang, Zhe Lin, Radomír Měch, Huchuan Lu, You He
[pdf]
[DOI]

SRNet: Improving Generalization in 3D Human Pose Estimation with a Split-and-Recombine Approach
Ailing Zeng, Xiao Sun, Fuyang Huang, Minhao Liu, Qiang Xu, Stephen Lin
[pdf]
[DOI]

CAFE-GAN: Arbitrary Face Attribute Editing with Complementary Attention Feature
Jeong gi Kwak, David K. Han, Hanseok Ko
[pdf]
[DOI]

MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection
Xin Lu, Quanquan Li, Buyu Li, Junjie Yan
[pdf]
[DOI]

Latent Topic-aware Multi-Label Classification
Jianghong Ma, Yang Liu
[pdf]
[DOI]

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning
Xiangxi Shi, Xu Yang, Jiuxiang Gu, Shafiq Joty, Jianfei Cai
[pdf]
[DOI]

Attract, Perturb, and Explore: Learning a Feature Alignment Network for Semi-supervised Domain Adaptation
Taekyung Kim, Changick Kim
[pdf]
[DOI]

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation
Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava
[pdf]
[DOI]

Powering One-shot Topological NAS with Stabilized Share-parameter Proxy
Ronghao Guo, Chen Lin, Chuming Li, Keyu Tian, Ming Sun, Lu Sheng, Junjie Yan
[pdf]
[DOI]

Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic Segmentation
Haoran Wang, Tong Shen, Wei Zhang, Ling-Yu Duan, Tao Mei
[pdf]
[DOI]

Boundary-preserving Mask R-CNN
Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu
[pdf]
[DOI]

Self-supervised Single-view 3D Reconstruction via Semantic Consistency
Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz
[pdf]
[DOI]

MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
[pdf]
[DOI]

Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling
Yuliang Zou, Pan Ji, Quoc-Huy Tran, Jia-Bin Huang, Manmohan Chandraker
[pdf]
[DOI]

The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation
Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
[pdf]
[DOI]

What is Learned in Deep Uncalibrated Photometric Stereo?
Guanying Chen, Michael Waechter, Boxin Shi, Kwan-Yee K. Wong, Yasuyuki Matsushita
[pdf]
[DOI]

Prior-based Domain Adaptive Object Detection for Hazy and Rainy Conditions
Vishwanath A. Sindagi, Poojan Oza, Rajeev Yasarla, Vishal M. Patel
[pdf]
[DOI]

Adversarial Ranking Attack and Defense
Mo Zhou, Zhenxing Niu, Le Wang, Qilin Zhang, Gang Hua
[pdf]
[DOI]

ReDro: Efficiently Learning Large-sized SPD Visual Representation
Saimunur Rahman, Lei Wang, Changming Sun, Luping Zhou
[pdf]
[DOI]

Graph-Based Social Relation Reasoning
Wanhua Li, Yueqi Duan, Jiwen Lu, Jianjiang Feng, Jie Zhou
[pdf]
[DOI]

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection
Tengteng Huang, Zhe Liu, Xiwu Chen, Xiang Bai
[pdf]
[DOI]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency
Jiaxiang Shang, Tianwei Shen, Shiwei Li, Lei Zhou, Mingmin Zhen, Tian Fang, Long Quan
[pdf]
[DOI]

Asynchronous Interaction Aggregation for Action Detection
Jiajun Tang, Jin Xia, Xinzhi Mu, Bo Pang, Cewu Lu
[pdf]
[DOI]

Shape and Viewpoint without Keypoints
Shubham Goel, Angjoo Kanazawa, Jitendra Malik
[pdf]
[DOI]

Learning Attentive and Hierarchical Representations for 3D Shape Recognition
Jiaxin Chen, Jie Qin, Yuming Shen, Li Liu, Fan Zhu, Ling Shao
[pdf]
[DOI]

TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search
Yibo Hu, Xiang Wu, Ran He
[pdf]
[DOI]

Associative3D: Volumetric Reconstruction from Sparse Views
Shengyi Qian, Linyi Jin, David F. Fouhey
[pdf]
[DOI]

PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit
Yongqiang Mou, Lei Tan, Hui Yang, Jingying Chen, Leyuan Liu, Rui Yan, Yaohong Huang
[pdf]
[DOI]

Memory Selection Network for Video Propagation
Ruizheng Wu, Huaijia Lin, Xiaojuan Qi, Jiaya Jia
[pdf]
[DOI]

Disentangled Non-local Neural Networks
Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu
[pdf]
[DOI]

URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark
Seonguk Seo, Joon-Young Lee, Bohyung Han
[pdf]
[DOI]

Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain Mixup
Chuanchen Luo, Chunfeng Song, Zhaoxiang Zhang
[pdf]
[DOI]

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks
Yan Liu, Lingqiao Liu, Peng Wang, Pingping Zhang, Yinjie Lei
[pdf]
[DOI]

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training
Hongkai Zhang, Hong Chang, Bingpeng Ma, Naiyan Wang, Xilin Chen
[pdf]
[DOI]

Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip
Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, Baoyuan Wu
[pdf]
[DOI]

Knowledge Transfer via Dense Cross-Layer Mutual-Distillation
Anbang Yao, Dawei Sun
[pdf]
[DOI]

Matching Guided Distillation
Kaiyu Yue, Jiangfan Deng, Feng Zhou
[pdf]
[DOI]

Clustering Driven Deep Autoencoder for Video Anomaly Detection
Yunpeng Chang, Zhigang Tu, Wei Xie, Junsong Yuan
[pdf]
[DOI]

Learning to Compose Hypercolumns for Visual Correspondence
Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho
[pdf]
[DOI]

Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction
Lei Zhou, Zixin Luo, Mingmin Zhen, Tianwei Shen, Shiwei Li, Zhuofei Huang, Tian Fang, Long Quan
[pdf]
[DOI]

Object-based Illumination Estimation with Rendering-aware Neural Networks
Xin Wei, Guojun Chen, Yue Dong, Stephen Lin, Xin Tong
[pdf]
[DOI]

Progressive Point Cloud Deconvolution Generation Network
Le Hui, Rui Xu, Jin Xie, Jianjun Qian, Jian Yang
[pdf]
[DOI]

SSCGAN: Facial Attribute Editing via Style Skip Connections
Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji
[pdf]
[DOI]

Negative Pseudo Labeling using Class Proportion for Semantic Segmentation in Pathology
Hiroki Tokunaga, Brian Kenji Iwana, Yuki Teramoto, Akihiko Yoshizawa, Ryoma Bise
[pdf]
[DOI]

Learn to Propagate Reliably on Noisy Affinity Graphs
Lei Yang, Qingqiu Huang, Huaiyi Huang, Linning Xu, Dahua Lin
[pdf]
[DOI]

Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search
Xiangxiang Chu, Tianbao Zhou, Bo Zhang, Jixiang Li
[pdf]
[DOI]

TANet: Towards Fully Automatic Tooth Arrangement
Guodong Wei, Zhiming Cui, Yumeng Liu, Nenglun Chen, Runnan Chen, Guiqing Li, Wenping Wang
[pdf]
[DOI]

UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection
Bumsoo Kim, Taeho Choi, Jaewoo Kang, Hyunwoo J. Kim
[pdf]
[DOI]

GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision
Lei Ke, Shichao Li, Yanan Sun, Yu-Wing Tai, Chi-Keung Tang
[pdf]
[DOI]

Resolution Switchable Networks for Runtime Efficient Image Recognition
Yikai Wang, Fuchun Sun, Duo Li, Anbang Yao
[pdf]
[DOI]

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou
[pdf]
[DOI]

Learning to Detect Open Classes for Universal Domain Adaptation
Bo Fu, Zhangjie Cao, Mingsheng Long, Jianmin Wang
[pdf]
[DOI]

Visual Compositional Learning for Human-Object Interaction Detection
Zhi Hou, Xiaojiang Peng, Yu Qiao, Dacheng Tao
[pdf]
[DOI]

Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches
Shuai Yang, Zhangyang Wang, Jiaying Liu, Zongming Guo
[pdf]
[DOI]

Rethinking Class Activation Mapping for Weakly Supervised Object Localization
Wonho Bae, Junhyug Noh, Gunhee Kim
[pdf]
[DOI]

OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features
Anton Osokin, Denis Sumin, Vasily Lomakin
[pdf]
[DOI]

Interpretable Neural Network Decoupling
Yuchao Li, Rongrong Ji, Shaohui Lin, Baochang Zhang, Chenqian Yan, Yongjian Wu, Feiyue Huang, Ling Shao
[pdf]
[DOI]

Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin
[pdf]
[DOI]

CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point Blending
Hang Xu, Shaoju Wang, Xinyue Cai, Wei Zhang, Xiaodan Liang, Zhenguo Li
[pdf]
[DOI]

Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation
Jiaxing Huang, Shijian Lu, Dayan Guan, Xiaobing Zhang
[pdf]
[DOI]

Estimating People Flows to Better Count Them in Crowded Scenes
Weizhe Liu, Mathieu Salzmann, Pascal Fua
[pdf]
[DOI]

Generate to Adapt: Resolution Adaption Network for Surveillance Face Recognition
Han Fang, Weihong Deng, Yaoyao Zhong, Jiani Hu
[pdf]
[DOI]

Learning Feature Embeddings for Discriminant Model based Tracking
Linyu Zheng, Ming Tang, Yingying Chen, Jinqiao Wang, Hanqing Lu
[pdf]
[DOI]

WeightNet: Revisiting the Design Space of Weight Networks
Ningning Ma, Xiangyu Zhang, Jiawei Huang, Jian Sun
[pdf]
[DOI]

Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift
Ryuhei Takahashi, Atsushi Hashimoto, Motoharu Sonogashira, Masaaki Iiyama
[pdf]
[DOI]

Learning Where to Focus for Efficient Video Object Detection
Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Peng Gao, Qian Zhang, Shiming Xiang, Chunhong Pan
[pdf]
[DOI]

Learning Object Permanence from Video
Aviv Shamsian, Ofri Kleinfeld, Amir Globerson, Gal Chechik
[pdf]
[DOI]

Adaptive Text Recognition through Visual Matching
Chuhan Zhang, Ankush Gupta, Andrew Zisserman
[pdf]
[DOI]

Actions as Moving Points
Yixuan Li, Zixu Wang, Limin Wang, Gangshan Wu
[pdf]
[DOI]

Learning to Exploit Multiple Vision Modalities by Using Grafted Networks
Yuhuang Hu, Tobi Delbruck, Shih-Chii Liu
[pdf]
[DOI]

Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild
Alexander Grabner, Yaming Wang, Peizhao Zhang, Peihong Guo, Tong Xiao, Peter Vajda, Peter M. Roth, Vincent Lepetit
[pdf]
[DOI]

3D Fluid Flow Reconstruction Using Compact Light Field PIV
Zhong Li, Yu Ji, Jingyi Yu, Jinwei Ye
[pdf]
[DOI]

Contextual Diversity for Active Learning
Sharat Agarwal, Himanshu Arora, Saket Anand, Chetan Arora
[pdf]
[DOI]

Temporal Aggregate Representations for Long-Range Video Understanding
Fadime Sener, Dipika Singhania, Angela Yao
[pdf]
[DOI]

Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition
Zhe Niu, Brian Mak
[pdf]
[DOI]

General 3D Room Layout from a Single View by Render-and-Compare
Sinisa Stekovic, Shreyas Hampali, Mahdi Rad, Sayan Deb Sarkar, Friedrich Fraundorfer, Vincent Lepetit
[pdf]
[DOI]

Neural Dense Non-Rigid Structure from Motion with Latent Space Constraints
Vikramjit Sidhu, Edgar Tretschk, Vladislav Golyanik, Antonio Agudo, Christian Theobalt
[pdf]
[DOI]

Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability
Anelise Newman, Camilo Fosco, Vincent Casser, Allen Lee, Barry McNamara, Aude Oliva
[pdf]
[DOI]

Yet Another Intermediate-Level Attack
Qizhang Li, Yiwen Guo, Hao Chen
[pdf]
[DOI]

Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
Chao Li, Xiaohu Guo
[pdf]
[DOI]

Early Exit Or Not: Resource-Efficient Blind Quality Enhancement for Compressed Images
Qunliang Xing, Mai Xu, Tianyi Li, Zhenyu Guan
[pdf]
[DOI]

PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
Edgar Tretschk, Ayush Tewari, Vladislav Golyanik, Michael Zollhöfer, Carsten Stoll, Christian Theobalt
[pdf]
[DOI]

How does Lipschitz Regularization Influence GAN Training?
Yipeng Qin, Niloy Mitra, Peter Wonka
[pdf]
[DOI]

Infrastructure-based Multi-Camera Calibration using Radial Projections
Yukai Lin, Viktor Larsson, Marcel Geppert, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler
[pdf]
[DOI]

MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
[pdf]
[DOI]

Polarized Optical-Flow Gyroscope
Masada Tzabari, Yoav Y. Schechner
[pdf]
[DOI]

Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation
Da Li, Timothy Hospedales
[pdf]
[DOI]

An Ensemble of Epoch-wise Empirical Bayes for Few-shot Learning
Yaoyao Liu, Bernt Schiele, Qianru Sun
[pdf]
[DOI]

On the Effectiveness of Image Rotation for Open Set Domain Adaptation
Silvia Bucci, Mohammad Reza Loghmani, Tatiana Tommasi
[pdf]
[DOI]

Combining Task Predictors via Enhancing Joint Predictability
Kwang In Kim, Christian Richardt, Hyung Jin Chang
[pdf]
[DOI]

Multi-Scale Positive Sample Refinement for Few-Shot Object Detection
Jiaxi Wu, Songtao Liu, Di Huang, Yunhong Wang
[pdf]
[DOI]

Single-Image Depth Prediction Makes Feature Matching Easier
Carl Toft, Daniyar Turmukhambetov, Torsten Sattler, Fredrik Kahl, Gabriel J. Brostow
[pdf]
[DOI]

Deep Reinforced Attention Learning for Quality-Aware Visual Recognition
Duo Li, Qifeng Chen
[pdf]
[DOI]

CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization
Yuxi Li, Weiyao Lin, John See, Ning Xu, Shugong Xu, Ke Yan, Cong Yang
[pdf]
[DOI]

Learning Joint Spatial-Temporal Transformations for Video Inpainting
Yanhong Zeng, Jianlong Fu, Hongyang Chao
[pdf]
[DOI]

Single Path One-Shot Neural Architecture Search with Uniform Sampling
Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, Jian Sun
[pdf]
[DOI]

Learning to Generate Novel Domains for Domain Generalization
Kaiyang Zhou, Yongxin Yang, Timothy Hospedales, Tao Xiang
[pdf]
[DOI]

Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections
Theodora Kontogianni, Michael Gygli, Jasper Uijlings, Vittorio Ferrari
[pdf]
[DOI]

Impact of base dataset design on few-shot image classification
Othman Sbai, Camille Couprie, Mathieu Aubry
[pdf]
[DOI]

Invertible Zero-Shot Recognition Flows
Yuming Shen, Jie Qin, Lei Huang, Li Liu, Fan Zhu, Ling Shao
[pdf]
[DOI]

GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes
Weidong Zhang, Wei Zhang, Yinda Zhang
[pdf]
[DOI]

Location Sensitive Image Retrieval and Tagging
Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas
[pdf]
[DOI]

Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image
Wei Zeng, Sezer Karaoglu, Theo Gevers
[pdf]
[DOI]

Guessing State Tracking for Visual Dialogue
Wei Pang, Xiaojie Wang
[pdf]
[DOI]

Memory-Efficient Incremental Learning Through Feature Adaptation
Ahmet Iscen, Jeffrey Zhang, Svetlana Lazebnik, Cordelia Schmid
[pdf]
[DOI]

Neural Voice Puppetry: Audio-driven Facial Reenactment
Justus Thies, Mohamed Elgharib, Ayush Tewari, Christian Theobalt, Matthias Nießner
[pdf]
[DOI]

One-Shot Unsupervised Cross-Domain Detection
Antonio D’Innocente, Francesco Cappio Borlino, Silvia Bucci, Barbara Caputo, Tatiana Tommasi
[pdf]
[DOI]

Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks
Majed El Helou, Ruofan Zhou, Sabine Süsstrunk
[pdf]
[DOI]

Probabilistic Future Prediction for Video Scene Understanding
Anthony Hu, Fergal Cotter, Nikhil Mohan, Corina Gurau, Alex Kendall
[pdf]
[DOI]

Suppressing Mislabeled Data via Grouping and Self-Attention
Xiaojiang Peng, Kai Wang, Zhaoyang Zeng, Qing Li, Jianfei Yang, Yu Qiao
[pdf]
[DOI]

Class-wise Dynamic Graph Convolution for Semantic Segmentation
Hanzhe Hu, Deyi Ji, Weihao Gan, Shuai Bai, Wei Wu, Junjie Yan
[pdf]
[DOI]

Character-Preserving Coherent Story Visualization
Yun-Zhu Song, Zhi Rui Tam, Hung-Jen Chen, Huiao-Han Lu, Hong-Han Shuai
[pdf]
[DOI]

GINet: Graph Interaction Network for Scene Parsing
Tianyi Wu, Yu Lu, Yu Zhu, Chuang Zhang, Ming Wu, Zhanyu Ma, Guodong Guo
[pdf]
[DOI]

Tensor Low-Rank Reconstruction for Semantic Segmentation
Wanli Chen, Xinge Zhu, Ruoqi Sun, Junjun He, Ruiyu Li, Xiaoyong Shen, Bei Yu
[pdf]
[DOI]

Attentive Normalization
Xilai Li, Wei Sun, Tianfu Wu
[pdf]
[DOI]

Count- and Similarity-aware R-CNN for Pedestrian Detection
Jin Xie, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Mubarak Shah
[pdf]
[DOI]

TRADI: Tracking Deep Neural network Weight Distributions
Gianni Franchi, Andrei Bursuc, Emanuel Aldea, Séverine Dubuisson, Isabelle Bloch
[pdf]
[DOI]

Spatiotemporal Attacks for Embodied Agents
Aishan Liu, Tairan Huang, Xianglong Liu, Yitao Xu, Yuqing Ma, Xinyun Chen, Stephen J. Maybank, Dacheng Tao
[pdf]
[DOI]

Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation
Qingqiu Huang, Lei Yang, Huaiyi Huang, Tong Wu, Dahua Lin
[pdf]
[DOI]

Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild
Liqian Ma, Zhe Lin, Connelly Barnes, Alexei A Efros, Jingwan Lu
[pdf]
[DOI]

Design and Interpretation of Universal Adversarial Patches in Face Detection
Xiao Yang, Fangyun Wei, Hongyang Zhang, Jun Zhu
[pdf]
[DOI]

Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild
Yang Xiao, Renaud Marlet
[pdf]
[DOI]

Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints
Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Otmar Hilliges, Jan Kautz
[pdf]
[DOI]

Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification
Mang Ye, Jianbing Shen, David J. Crandall, Ling Shao, Jiebo Luo
[pdf]
[DOI]

Contextual Heterogeneous Graph Network for Human-Object Interaction Detection
Hai Wang, Wei-shi Zheng, Ling Yingbiao
[pdf]
[DOI]

Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning
Xi Cheng, Zhenyong Fu, Jian Yang
[pdf]
[DOI]

A Closest Point Proposal for MCMC-based Probabilistic Surface Registration
Dennis Madsen, Andreas Morel-Forster, Patrick Kahr, Dana Rahbani, Thomas Vetter, Marcel Lüthi
[pdf]
[DOI]

Interactive Video Object Segmentation Using Global and Local Transfer Modules
Yuk Heo, Yeong Jun Koh, Chang-Su Kim
[pdf]
[DOI]

End-to-end Interpretable Learning of Non-blind Image Deblurring
Thomas Eboli, Jian Sun, Jean Ponce
[pdf]
[DOI]

Employing Multi-Estimations for Weakly-Supervised Semantic Segmentation
Junsong Fan, Zhaoxiang Zhang, Tieniu Tan
[pdf]
[DOI]

Learning Noise-Aware Encoder-Decoder from Noisy Labels by Alternating Back-Propagation for Saliency Detection
Jing Zhang, Jianwen Xie, Nick Barnes
[pdf]
[DOI]

Rethinking Image Deraining via Rain Streaks and Vapors
Yinglong Wang, Yibing Song, Chao Ma, Bing Zeng
[pdf]
[DOI]

Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes
Marcelo Gennari do Nascimento, Theo W. Costain, Victor Adrian Prisacariu
[pdf]
[DOI]

Is Sharing of Egocentric Video Giving Away Your Biometric Signature?
Daksh Thapar, Chetan Arora, Aditya Nigam
[pdf]
[DOI]

Captioning Images Taken by People Who Are Blind
Danna Gurari, Yinan Zhao, Meng Zhang, Nilavra Bhattacharya
[pdf]
[DOI]

Improving Semantic Segmentation via Decoupled Body and Edge Supervision
Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong
[pdf]
[DOI]

Conditional Entropy Coding for Efficient Video Compression
Jerry Liu, Shenlong Wang, Wei-Chiu Ma, Meet Shah, Rui Hu, Pranaab Dhawan, Raquel Urtasun
[pdf]
[DOI]

Differentiable Feature Aggregation Search for Knowledge Distillation
Yushuo Guan, Pengyu Zhao, Bingxuan Wang, Yuanxing Zhang, Cong Yao, Kaigui Bian, Jian Tang
[pdf]
[DOI]

Attention Guided Anomaly Localization in Images
Shashanka Venkataramanan, Kuan-Chuan Peng, Rajat Vikram Singh, Abhijit Mahalanobis
[pdf]
[DOI]

Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang, Jianbo Jiao, Yun-Hui Liu
[pdf]
[DOI]

Full-Body Awareness from Partial Observations
Chris Rockwell, David F. Fouhey
[pdf]
[DOI]

Reinforced Axial Refinement Network for Monocular 3D Object Detection
Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie Zhou, Qi Tian
[pdf]
[DOI]

Self-Supervised Multi-Task Procedure Learning from Instructional Videos
Ehsan Elhamifar, Dat Huynh
[pdf]
[DOI]

CosyPose: Consistent multi-view multi-object 6D pose estimation
Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic
[pdf]
[DOI]

In-Domain GAN Inversion for Real Image Editing
Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou
[pdf]
[DOI]

Key Frame Proposal Network for Efficient Pose Estimation in Videos
Yuexi Zhang, Yin Wang, Octavia Camps, Mario Sznaier
[pdf]
[DOI]

Exchangeable Deep Neural Networks for Set-to-Set Matching and Learning
Yuki Saito, Takuma Nakamura, Hirotaka Hachiya, Kenji Fukumizu
[pdf]
[DOI]

Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs
Robin Rombach, Patrick Esser, Björn Ommer
[pdf]
[DOI]

Cross-Modal Weighting Network for RGB-D Salient Object Detection
Gongyang Li, Zhi Liu, Linwei Ye, Yang Wang, Haibin Ling
[pdf]
[DOI]

Open-set Adversarial Defense
Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
[pdf]
[DOI]

Deep Image Compression using Decoder Side Information
Sharon Ayzik, Shai Avidan
[pdf]
[DOI]

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation
Jeevan Devaranjan, Amlan Kar, Sanja Fidler
[pdf]
[DOI]

A Generic Visualization Approach for Convolutional Neural Networks
Ahmed Taha, Xitong Yang, Abhinav Shrivastava, Larry Davis
[pdf]
[DOI]

Interactive Annotation of 3D Object Geometry using 2D Scribbles
Tianchang Shen, Jun Gao, Amlan Kar, Sanja Fidler
[pdf]
[DOI]

Hierarchical Kinematic Human Mesh Recovery
Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Košecká, Ziyan Wu
[pdf]
[DOI]

Multi-Loss Rebalancing Algorithm for Monocular Depth Estimation
Jae-Han Lee, Chang-Su Kim
[pdf]
[DOI]

3D Bird Reconstruction: a Dataset, Model, and Shape Recovery from a Single View
Marc Badger, Yufu Wang, Adarsh Modh, Ammon Perkes, Nikos Kolotouros, Bernd G. Pfrommer, Marc F. Schmidt, Kostas Daniilidis
[pdf]
[DOI]

We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
Alex Andonian, Camilo Fosco, Mathew Monfort, Allen Lee, Rogerio Feris, Carl Vondrick, Aude Oliva
[pdf]
[DOI]

Joint Optimization for Multi-Person Shape Models from Markerless 3D-Scans
Samuel Zeitvogel, Johannes Dornheim, Astrid Laubenheimer
[pdf]
[DOI]

Accurate RGB-D Salient Object Detection via Collaborative Learning
Wei Ji, Jingjing Li, Miao Zhang, Yongri Piao, Huchuan Lu
[pdf]
[DOI]

Finding Your (3D) Center: 3D Object Detection Using a Learned Loss
David Griffiths, Jan Boehm, Tobias Ritschel
[pdf]
[DOI]

Collaborative Training between Region Proposal Localization and Classification for Domain Adaptive Object Detection
Ganlong Zhao, Guanbin Li, Ruijia Xu, Liang Lin
[pdf]
[DOI]

Two Stream Active Query Suggestion for Active Learning in Connectomics
Zudi Lin, Donglai Wei, Won-Dong Jang, Siyan Zhou, Xupeng Chen, Xueying Wang, Richard Schalek, Daniel Berger, Brian Matejek, Lee Kamentsky, Adi Peleg, Daniel Haehn, Thouis Jones, Toufiq Parag, Jeff Lichtman, Hanspeter Pfister
[pdf]
[DOI]

Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images
Jiahui Lei, Srinath Sridhar, Paul Guerrero, Minhyuk Sung, Niloy Mitra, Leonidas J. Guibas
[pdf]
[DOI]

6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference
Mai Bui, Tolga Birdal, Haowen Deng, Shadi Albarqouni, Leonidas Guibas, Slobodan Ilic, Nassir Navab
[pdf]
[DOI]

Modeling Artistic Workflows for Image Generation and Editing
Hung-Yu Tseng, Matthew Fisher, Jingwan Lu, Yijun Li, Vladimir Kim, Ming-Hsuan Yang
[pdf]
[DOI]

A Large-scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks
Sangpil Kim, Hyung-gun Chi, Xiao Hu, Qixing Huang, Karthik Ramani
[pdf]
[DOI]

Hidden Footprints: Learning Contextual Walkability from 3D Human Trails
Jin Sun, Hadar Averbuch-Elor, Qianqian Wang, Noah Snavely
[pdf]
[DOI]

Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras, Andrew Owens, Joon Son Chung, Andrew Zisserman
[pdf]
[DOI]

GAN-based Garment Generation Using Sewing Pattern Images
Yu Shen, Junbang Liang, Ming C. Lin
[pdf]
[DOI]

Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach
Chaitanya Ahuja, Dong Won Lee, Yukiko I. Nakano, Louis-Philippe Morency
[pdf]
[DOI]

An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds
Rui Huang, Wanyue Zhang, Abhijit Kundu, Caroline Pantofaru, David A Ross, Thomas Funkhouser, Alireza Fathi
[pdf]
[DOI]

Monotonicity Prior for Cloud Tomography
Tamar Loeub, Aviad Levis, Vadim Holodovsky, Yoav Y. Schechner
[pdf]
[DOI]

Learning Trailer Moments in Full-Length Movies with Co-Contrastive Attention
Lezi Wang, Dong Liu, Rohit Puri, Dimitris N. Metaxas
[pdf]
[DOI]

Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval
Christopher Thomas, Adriana Kovashka
[pdf]
[DOI]

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das
[pdf]
[DOI]

Learning to Generate Grounded Visual Captions without Localization Supervision
Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira
[pdf]
[DOI]

Neural Hair Rendering
Menglei Chai, Jian Ren, Sergey Tulyakov
[pdf]
[DOI]

JNR: Joint-based Neural Rig Representation for Compact 3D Face Modeling
Noranart Vesdapunt, Mitch Rundle, HsiangTao Wu, Baoyuan Wang
[pdf]
[DOI]

On Disentangling Spoof Trace for Generic Face Anti-Spoofing
Yaojie Liu, Joel Stehouwer, Xiaoming Liu
[pdf]
[DOI]

Streaming Object Detection for 3-D Point Clouds
Wei Han, Zhengdong Zhang, Benjamin Caine, Brandon Yang, Christoph Sprunk, Ouais Alsharif, Jiquan Ngiam, Vijay Vasudevan, Jonathon Shlens, Zhifeng Chen
[pdf]
[DOI]

NAS-DIP: Learning Deep Image Prior with Neural Architecture Search
Yun-Chun Chen, Chen Gao, Esther Robb, Jia-Bin Huang
[pdf]
[DOI]

Learning to Learn in a Semi-Supervised Fashion
Yun-Chun Chen, Chao-Te Chou, Yu-Chiang Frank Wang
[pdf]
[DOI]

FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning
Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira
[pdf]
[DOI]

RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects
Bin Yang, Runsheng Guo, Ming Liang, Sergio Casas, Raquel Urtasun
[pdf]
[DOI]

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
Medhini Narasimhan, Erik Wijmans, Xinlei Chen, Trevor Darrell, Dhruv Batra, Devi Parikh, Amanpreet Singh
[pdf]
[DOI]

Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes
Chenhongyi Yang, Vitaly Ablavsky, Kaihong Wang, Qi Feng, Margrit Betke
[pdf]
[DOI]

Towards causal benchmarking of bias in face analysis algorithms
Guha Balakrishnan, Yuanjun Xiong, Wei Xia, Pietro Perona
[pdf]
[DOI]

Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation
Tong He, Dong Gong, Zhi Tian, Chunhua Shen
[pdf]
[DOI]

Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
Noa Garcia, Yuta Nakashima
[pdf]
[DOI]

Transformation Consistency Regularization – A Semi-Supervised Paradigm for Image-to-Image Translation
Aamir Mustafa, Rafal K. Mantiuk
[pdf]
[DOI]

LIRA: Lifelong Image Restoration from Unknown Blended Distortions
Jianzhao Liu, Jianxin Lin, Xin Li, Wei Zhou, Sen Liu, Zhibo Chen
[pdf]
[DOI]

HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization
Jiahao Lin, Gim Hee Lee
[pdf]
[DOI]

SOLO: Segmenting Objects by Locations
Xinlong Wang, Tao Kong, Chunhua Shen, Yuning Jiang, Lei Li
[pdf]
[DOI]

Learning to See in the Dark with Events
Song Zhang, Yu Zhang, Zhe Jiang, Dongqing Zou, Jimmy Ren, Bin Zhou
[pdf]
[DOI]

Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data
Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, Marco Pavone
[pdf]
[DOI]

Context-Gated Convolution
Xudong Lin, Lin Ma, Wei Liu, Shih-Fu Chang
[pdf]
[DOI]

Polynomial Regression Network for Variable-Number Lane Detection
Bingke Wang, Zilei Wang, Yixin Zhang
[pdf]
[DOI]

Structural Deep Metric Learning for Room Layout Estimation
Wenzhao Zheng, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Adaptive Task Sampling for Meta-Learning
Chenghao Liu, Zhihao Wang, Doyen Sahoo, Yuan Fang, Kun Zhang, Steven C.H. Hoi
[pdf]
[DOI]

Deep Complementary Joint Model for Complex Scene Registration and Few-shot Segmentation on Medical Images
Yuting He, Tiantian Li, Guanyu Yang, Youyong Kong, Yang Chen, Huazhong Shu, Jean-Louis Coatrieux, Jean-Louis Dillenseger, Shuo Li
[pdf]
[DOI]

Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems
Kailai Zhou, Linsen Chen, Xun Cao
[pdf]
[DOI]

High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling
Yu Zeng, Zhe Lin, Jimei Yang, Jianming Zhang, Eli Shechtman, Huchuan Lu
[pdf]
[DOI]

Online Ensemble Model Compression using Knowledge Distillation
Devesh Walawalkar, Zhiqiang Shen, Marios Savvides
[pdf]
[DOI]

Deep Learning-based Pupil Center Detection for Fast and Accurate Eye Tracking System
Kang Il Lee, Jung Ho Jeon, Byung Cheol Song
[pdf]
[DOI]

Efficient Residue Number System Based Winograd Convolution
Zhi-Gang Liu, Matthew Mattina
[pdf]
[DOI]

Robust Tracking against Adversarial Attacks
Shuai Jia, Chao Ma, Yibing Song, Xiaokang Yang
[pdf]
[DOI]

Single-Shot Neural Relighting and SVBRDF Estimation
Shen Sang, Manmohan Chandraker
[pdf]
[DOI]

Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement
Qiang Nie, Ziwei Liu, Yunhui Liu
[pdf]
[DOI]

Angle-based Search Space Shrinking for Neural Architecture Search
Yiming Hu, Yuding Liang, Zichao Guo, Ruosi Wan, Xiangyu Zhang, Yichen Wei, Qingyi Gu, Jian Sun
[pdf]
[DOI]

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang
[pdf]
[DOI]

Towards Fast, Accurate and Stable 3D Dense Face Alignment
Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei, Stan Z. Li
[pdf]
[DOI]

Iterative Feature Transformation for Fast and Versatile Universal Style Transfer
Tai-Yin Chiu, Danna Gurari
[pdf]
[DOI]

CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search
Xin Chen, Yawen Duan, Zewei Chen, Hang Xu, Zihao Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
[pdf]
[DOI]

Toward Faster and Simpler Matrix Normalization via Rank-1 Update
Tan Yu, Yunfeng Cai, Ping Li
[pdf]
[DOI]

Accurate Polarimetric BRDF for Real Polarization Scene Rendering
Yuhi Kondo, Taishi Ono, Legong Sun, Yasutaka Hirasawa, Jun Murayama
[pdf]
[DOI]

Lensless Imaging with Focusing Sparse URA Masks in Long-Wave Infrared and its Application for Human Detection
Ilya Reshetouski, Hideki Oyaizu, Kenichiro Nakamura, Ryuta Satoh, Suguru Ushiki, Ryuichi Tadano, Atsushi Ito, Jun Murayama
[pdf]
[DOI]

Topology-Preserving Class-Incremental Learning
Xiaoyu Tao, Xinyuan Chang, Xiaopeng Hong, Xing Wei, Yihong Gong
[pdf]
[DOI]

Inter-Image Communication for Weakly Supervised Localization
Xiaolin Zhang, Yunchao Wei, Yi Yang
[pdf]
[DOI]

UFO²: A Unified Framework towards Omni-supervised Object Detection
Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz
[pdf]
[DOI]

iCaps: An Interpretable Classifier via Disentangled Capsule Networks
Dahuin Jung, Jonghyun Lee, Jihun Yi, Sungroh Yoon
[pdf]
[DOI]

Detecting Natural Disasters, Damage, and Incidents in the Wild
Ethan Weber, Nuria Marzo, Dim P. Papadopoulos, Aritro Biswas, Agata Lapedriza, Ferda Ofli, Muhammad Imran, Antonio Torralba
[pdf]
[DOI]

Dynamic ReLU
Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Lu Yuan, Zicheng Liu
[pdf]
[DOI]

Acquiring Dynamic Light Fields through Coded Aperture Camera
Kohei Sakai, Keita Takahashi, Toshiaki Fujii, Hajime Nagahara
[pdf]
[DOI]

Gait Recognition from a Single Image using a Phase-Aware Gait Cycle Reconstruction Network
Chi Xu, Yasushi Makihara, Xiang Li, Yasushi Yagi, Jianfeng Lu
[pdf]
[DOI]

Informative Sample Mining Network for Multi-Domain Image-to-Image Translation
Jie Cao, Huaibo Huang, Yi Li, Ran He, Zhenan Sun
[pdf]
[DOI]

Spherical Feature Transform for Deep Metric Learning
Yuke Zhu, Yan Bai, Yichen Wei
[pdf]
[DOI]

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering
Ruixue Tang, Chao Ma, Wei Emma Zhang, Qi Wu, Xiaokang Yang
[pdf]
[DOI]

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes
Ran Song, Wei Zhang, Yitian Zhao, Yonghuai Liu
[pdf]
[DOI]

Representation Sharing for Fast Object Detector Search and Beyond
Yujie Zhong, Zelu Deng, Sheng Guo, Matthew R. Scott, Weilin Huang
[pdf]
[DOI]

Peeking into occluded joints: A novel framework for crowd pose estimation
Lingteng Qiu, Xuanye Zhang, Yanran Li, Guanbin Li, Xiaojun Wu, Zixiang Xiong, Xiaoguang Han, Shuguang Cui
[pdf]
[DOI]

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei
[pdf]
[DOI]

Deep Hashing with Active Pairwise Supervision
Ziwei Wang, Quan Zheng, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Graph Edit Distance Reward: Learning to Edit Scene Graph
Lichang Chen, Guosheng Lin, Shijie Wang, Qingyao Wu
[pdf]
[DOI]

Malleable 2.5D Convolution: Learning Receptive Fields along the Depth-axis for RGB-D Scene Parsing
Yajie Xing, Jingbo Wang, Gang Zeng
[pdf]
[DOI]

Feature-metric Loss for Self-supervised Learning of Depth and Egomotion
Chang Shu, Kun Yu, Zhixiang Duan, Kuiyuan Yang
[pdf]
[DOI]

Propagating Over Phrase Relations for One-Stage Visual Grounding
Sibei Yang, Guanbin Li, Yizhou Yu
[pdf]
[DOI]

Adversarial Semantic Data Augmentation for Human Pose Estimation
Yanrui Bin, Xuan Cao, Xinya Chen, Yanhao Ge, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Changxin Gao, Nong Sang
[pdf]
[DOI]

Free View Synthesis
Gernot Riegler, Vladlen Koltun
[pdf]
[DOI]

Face Anti-Spoofing via Disentangled Representation Learning
Ke-Yue Zhang, Taiping Yao, Jian Zhang, Ying Tai, Shouhong Ding, Jilin Li, Feiyue Huang, Haichuan Song, Lizhuang Ma
[pdf]
[DOI]

Prime-Aware Adaptive Distillation
Youcai Zhang, Zhonghao Lan, Yuchen Dai, Fangao Zeng, Yan Bai, Jie Chang, Yichen Wei
[pdf]
[DOI]

Meta-Learning with Network Pruning
Hongduan Tian, Bo Liu, Xiao-Tong Yuan, Qingshan Liu
[pdf]
[DOI]

Spiral Generative Network for Image Extrapolation
Dongsheng Guo, Hongzhi Liu, Haoru Zhao, Yunhao Cheng, Qingwei Song, Zhaorui Gu, Haiyong Zheng, Bing Zheng
[pdf]
[DOI]

SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches
Fang Liu, Changqing Zou, Xiaoming Deng, Ran Zuo, Yu-Kun Lai, Cuixia Ma, Yong-Jin Liu, Hongan Wang
[pdf]
[DOI]

Few-shot Compositional Font Generation with Dual Memory
Junbum Cha, Sanghyuk Chun, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee
[pdf]
[DOI]

PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling
Yue Qian, Junhui Hou, Sam Kwong, Ying He
[pdf]
[DOI]

Handcrafted Outlier Detection Revisited
Luca Cavalli, Viktor Larsson, Martin Ralf Oswald, Torsten Sattler, Marc Pollefeys
[pdf]
[DOI]

The Average Mixing Kernel Signature
Luca Cosmo, Giorgia Minello, Michael Bronstein, Luca Rossi, Andrea Torsello
[pdf]
[DOI]

BCNet: Learning Body and Cloth Shape from A Single Image
Boyi Jiang, Juyong Zhang, Yang Hong, Jinhao Luo, Ligang Liu, Hujun Bao
[pdf]
[DOI]

Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos
Umer Rafi, Andreas Doering, Bastian Leibe, Juergen Gall
[pdf]
[DOI]

Interactive Multi-Dimension Modulation with Dynamic Controllable Residual Learning for Image Restoration
Jingwen He, Chao Dong, Yu Qiao
[pdf]
[DOI]

Polysemy Deciphering Network for Human-Object Interaction Detection
Xubin Zhong, Changxing Ding, Xian Qu, Dacheng Tao
[pdf]
[DOI]

PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning
Arthur Douillard, Matthieu Cord, Charles Ollion, Thomas Robert, Eduardo Valle
[pdf]
[DOI]

Learning Graph-Convolutional Representations for Point Cloud Denoising
Francesca Pistilli, Giulia Fracastoro, Diego Valsesia, Enrico Magli
[pdf]
[DOI]

Semantic Line Detection Using Mirror Attention and Comparative Ranking and Matching
Dongkwon Jin, Jun-Tae Lee, Chang-Su Kim
[pdf]
[DOI]

A Differentiable Recurrent Surface for Asynchronous Event-Based Data
Marco Cannici, Marco Ciccone, Andrea Romanoni, Matteo Matteucci
[pdf]
[DOI]

Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches
Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Zhanyu Ma, Yi-Zhe Song, Jun Guo
[pdf]
[DOI]

LiteFlowNet3: Resolving Correspondence Ambiguity for More Accurate Optical Flow Estimation
Tak-Wai Hui, Chen Change Loy
[pdf]
[DOI]

Microscopy Image Restoration with Deep Wiener-Kolmogorov Filters
Valeriya Pronina, Filippos Kokkinos, Dmitry V. Dylov, Stamatios Lefkimmiatis
[pdf]
[DOI]

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
[pdf]
[DOI]

JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-lan Tai
[pdf]
[DOI]

Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior
Hu Zhang, Linchao Zhu, Yi Zhu, Yi Yang
[pdf]
[DOI]

An Inference Algorithm for Multi-Label MRF-MAP Problems with Clique Size 100
Ishant Shanu, Siddhant Bharti, Chetan Arora, S. N. Maheshwari
[pdf]
[DOI]

Dual Refinement Underwater Object Detection Network
Baojie Fan, Wei Chen, Yang Cong, Jiandong Tian
[pdf]
[DOI]

Multiple Sound Sources Localization from Coarse to Fine
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, Weiyao Lin
[pdf]
[DOI]

Task-Aware Quantization Network for JPEG Image Compression
Jinyoung Choi, Bohyung Han
[pdf]
[DOI]

Energy-Based Models for Deep Probabilistic Regression
Fredrik K. Gustafsson, Martin Danelljan, Goutam Bhat, Thomas B. Schön
[pdf]
[DOI]

CLOTH3D: Clothed 3D Humans
Hugo Bertiche, Meysam Madadi, Sergio Escalera
[pdf]
[DOI]

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images
Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao
[pdf]
[DOI]

CLNet: A Compact Latent Network for Fast Adjusting Siamese Trackers
Xingping Dong, Jianbing Shen, Ling Shao, Fatih Porikli
[pdf]
[DOI]

Occlusion-Aware Siamese Network for Human Pose Estimation
Lu Zhou, Yingying Chen, Yunze Gao, Jinqiao Wang, Hanqing Lu
[pdf]
[DOI]

Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model
Yufan Liu, Minglang Qiao, Mai Xu, Bing Li, Weiming Hu, Ali Borji
[pdf]
[DOI]

NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image
Lizhen Wang, Xiaochen Zhao, Tao Yu, Songtao Wang, Yebin Liu
[pdf]
[DOI]

Model-based occlusion disentanglement for image-to-image translation
Fabio Pizzati, Pietro Cerri, Raoul de Charette
[pdf]
[DOI]

Rotation-robust Intersection over Union for 3D Object Detection
Yu Zheng, Danyang Zhang, Sinan Xie, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

New Threats against Object Detector with Non-local Block
Yi Huang, Fan Wang, Adams Wai-Kin Kong, Kwok-Yan Lam
[pdf]
[DOI]

Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation
Xinpeng Xie, Jiawei Chen, Yuexiang Li, Linlin Shen, Kai Ma, Yefeng Zheng
[pdf]
[DOI]

On the Usage of the Trifocal Tensor in Motion Segmentation
Federica Arrigoni, Luca Magri, Tomas Pajdla
[pdf]
[DOI]

3D-Rotation-Equivariant Quaternion Neural Networks
Wen Shen, Binbin Zhang, Shikun Huang, Zhihua Wei, Quanshi Zhang
[pdf]
[DOI]

InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image
Gyeongsik Moon, Shoou-I Yu, He Wen, Takaaki Shiratori, Kyoung Mu Lee
[pdf]
[DOI]

Active Crowd Counting with Limited Supervision
Zhen Zhao, Miaojing Shi, Xiaoxiao Zhao, Li Li
[pdf]
[DOI]

Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
Marvin Klingner, Jan-Aike Termöhlen, Jonas Mikolajczyk, Tim Fingscheidt
[pdf]
[DOI]

Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language
Shaoxiang Chen, Yu-Gang Jiang
[pdf]
[DOI]

Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On
Thibaut Issenhuth, Jérémie Mary, Clément Calauzènes
[pdf]
[DOI]

NODIS: Neural Ordinary Differential Scene Understanding
Yuren Cong, Hanno Ackermann, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn
[pdf]
[DOI]

AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo, AJ Piergiovanni, Juhana Kangaspunta, Anelia Angelova
[pdf]
[DOI]

Learning Propagation Rules for Attribution Map Generation
Yiding Yang, Jiayan Qiu, Mingli Song, Dacheng Tao, Xinchao Wang
[pdf]
[DOI]

Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference
Menelaos Kanakis, David Bruggemann, Suman Saha, Stamatios Georgoulis, Anton Obukhov, Luc Van Gool
[pdf]
[DOI]

Learning Predictive Models from Observation and Interaction
Karl Schmeckpeper, Annie Xie, Oleh Rybkin, Stephen Tian, Kostas Daniilidis, Sergey Levine, Chelsea Finn
[pdf]
[DOI]

Unifying Deep Local and Global Features for Image Search
Bingyi Cao, André Araujo, Jack Sim
[pdf]
[DOI]

Human Body Model Fitting by Learned Gradient Descent
Jie Song, Xu Chen, Otmar Hilliges
[pdf]
[DOI]

DDGCN: A Dynamic Directed Graph Convolutional Network for Action Recognition
Matthew Korban, Xin Li
[pdf]
[DOI]

Learning latent representations across multiple data domains using Lifelong VAEGAN
Fei Ye, Adrian G. Bors
[pdf]
[DOI]

DVI: Depth Guided Video Inpainting for Autonomous Driving
Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Wei Li, Ruigang Yang
[pdf]
[DOI]

Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation
Kenan E. Ak, Ning Xu, Zhe Lin, Yilin Wang
[pdf]
[DOI]

APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection
A. Braunegg, Amartya Chakraborty, Michael Krumdick, Nicole Lape, Sara Leary, Keith Manville, Elizabeth Merkhofer, Laura Strickhart, Matthew Walmer
[pdf]
[DOI]

Visual Question Answering on Image Sets
Ankan Bansal, Yuting Zhang, Rama Chellappa
[pdf]
[DOI]

Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
Qi Chen, Lin Sun, Zhixin Wang, Kui Jia, Alan Yuille
[pdf]
[DOI]

Placepedia: Comprehensive Place Understanding with Multi-Faceted Annotations
Huaiyi Huang, Yuqi Zhang, Qingqiu Huang, Zhengkui Guo, Ziwei Liu, Dahua Lin
[pdf]
[DOI]

DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points
Ayan Sinha, Zak Murez, James Bartolozzi, Vijay Badrinarayanan, Andrew Rabinovich
[pdf]
[DOI]

Dynamic Low-light Imaging with Quanta Image Sensors
Yiheng Chi, Abhiram Gnanasambandam, Vladlen Koltun, Stanley H. Chan
[pdf]
[DOI]

Disambiguating Monocular Depth Estimation with a Single Transient
Mark Nishimura, David B. Lindell, Christopher Metzler, Gordon Wetzstein
[pdf]
[DOI]

DSDNet: Deep Structured self-Driving Network
Wenyuan Zeng, Shenlong Wang, Renjie Liao, Yun Chen, Bin Yang, Raquel Urtasun
[pdf]
[DOI]

QuEST: Quantized Embedding Space for Transferring Knowledge
Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord
[pdf]
[DOI]

EGDCL: An Adaptive Curriculum Learning Framework for Unbiased Glaucoma Diagnosis
Rongchang Zhao, Xuanlin Chen, Zailiang Chen, Shuo Li
[pdf]
[DOI]

Backpropagated Gradient Representations for Anomaly Detection
Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
[pdf]
[DOI]

Dense RepPoints: Representing Visual Objects with Dense Point Sets
Ze Yang, Yinghao Xu, Han Xue, Zheng Zhang, Raquel Urtasun, Liwei Wang, Stephen Lin, Han Hu
[pdf]
[DOI]

On Dropping Clusters to Regularize Graph Convolutional Neural Networks
Xikun Zhang, Chang Xu, Dacheng Tao
[pdf]
[DOI]

Adaptive Video Highlight Detection by Learning from User History
Mrigank Rochan, Mahesh Kumar Krishna Reddy, Linwei Ye, Yang Wang
[pdf]
[DOI]

Improving 3D Object Detection through Progressive Population Based Augmentation
Shuyang Cheng, Zhaoqi Leng, Ekin Dogus Cubuk, Barret Zoph, Chunyan Bai, Jiquan Ngiam, Yang Song, Benjamin Caine, Vijay Vasudevan, Congcong Li, Quoc V. Le, Jonathon Shlens, Dragomir Anguelov
[pdf]
[DOI]

DR-KFS: A Differentiable Visual Similarity Metric for 3D Shape Reconstruction
Jiongchao Jin, Akshay Gadi Patil, Zhang Xiong, Hao Zhang
[pdf]
[DOI]

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization
Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia
[pdf]
[DOI]

Adversarial Learning for Zero-shot Domain Adaptation
Jinghua Wang, Jianmin Jiang
[pdf]
[DOI]

YOLO in the Dark - Domain Adaptation Method for Merging Multiple Models -
Yukihiro Sasagawa, Hajime Nagahara
[pdf]
[DOI]

Identity-Aware Multi-Sentence Video Description
Jae Sung Park, Trevor Darrell, Anna Rohrbach
[pdf]
[DOI]

VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang
[pdf]
[DOI]

Piggyback GAN: Efficient Lifelong Learning for Image Conditioned Generation
Mengyao Zhai, Lei Chen, Jiawei He, Megha Nawhal, Frederick Tung, Greg Mori
[pdf]
[DOI]

TRRNet: Tiered Relation Reasoning for Compositional Visual Question Answering
Xiaofeng Yang, Guosheng Lin, Fengmao Lv, Fayao Liu
[pdf]
[DOI]

Mining Inter-Video Proposal Relations for Video Object Detection
Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao
[pdf]
[DOI]

TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
[pdf]
[DOI]

Minimum Class Confusion for Versatile Domain Adaptation
Ying Jin, Ximei Wang, Mingsheng Long, Jianmin Wang
[pdf]
[DOI]

Large Batch Optimization for Object Detection: Training COCO in 12 Minutes
Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Yaowei Wang, Jinqiao Wang, Ming Tang
[pdf]
[DOI]

Towards Practical and Efficient High-Resolution HDR Deghosting with CNN
K. Ram Prabhakar, Susmit Agrawal, Durgesh Kumar Singh, Balraj Ashwath, R. Venkatesh Babu
[pdf]
[DOI]

Monocular Differentiable Rendering for Self-Supervised 3D Object Detection
Deniz Beker, Hiroharu Kato, Mihai Adrian Morariu, Takahiro Ando, Toru Matsuoka, Wadim Kehl, Adrien Gaidon
[pdf]
[DOI]

Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation
Meng Tian, Marcelo H Ang Jr, Gim Hee Lee
[pdf]
[DOI]

Dynamic and Static Context-aware LSTM for Multi-agent Motion Prediction
Chaofan Tao, Qinhong Jiang, Lixin Duan, Ping Luo
[pdf]
[DOI]

Image-based table recognition: data, model, and evaluation
Xu Zhong, Elaheh ShafieiBavani, Antonio Jimeno Yepes
[pdf]
[DOI]

Group Activity Prediction with Sequential Relational Anticipation Model
Junwen Chen, Wentao Bao, Yu Kong
[pdf]
[DOI]

PiP: Planning-informed Trajectory Prediction for Autonomous Driving
Haoran Song, Wenchao Ding, Yuxuan Chen, Shaojie Shen, Michael Yu Wang, Qifeng Chen
[pdf]
[DOI]

PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer
Duo Li, Anbang Yao, Qifeng Chen
[pdf]
[DOI]

Hierarchical Context Embedding for Region-based Object Detection
Zhao-Min Chen, Xin Jin, Borui Zhao, Xiu-Shen Wei, Yanwen Guo
[pdf]
[DOI]

Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition
Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
[pdf]
[DOI]

Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection
Yuliang Guo, Guang Chen, Peitao Zhao, Weide Zhang, Jinghao Miao, Jingao Wang, Tae Eun Choe
[pdf]
[DOI]

Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction
Xin Xiong, Haipeng Xiong, Ke Xian, Chen Zhao, Zhiguo Cao, Xin Li
[pdf]
[DOI]

MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation
Kaisiyuan Wang, Qianyi Wu, Linsen Song, Zhuoqian Yang, Wayne Wu, Chen Qian, Ran He, Yu Qiao, Chen Change Loy
[pdf]
[DOI]

Detecting Human-Object Interactions with Action Co-occurrence Priors
Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, In So Kweon
[pdf]
[DOI]

Learning Connectivity of Neural Networks from a Topological Perspective
Kun Yuan, Quanquan Li, Jing Shao, Junjie Yan
[pdf]
[DOI]

JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm Based on Modified Partial Convolution and Veiling Effect Removal
Wei-Ting Chen, Hao-Yu Fang, Jian-Jiun Ding, Cheng-Che Tsai, Sy-Yen Kuo
[pdf]
[DOI]

Ocean: Object-aware Anchor-free Tracking
Zhipeng Zhang, Houwen Peng, Jianlong Fu, Bing Li, Weiming Hu
[pdf]
[DOI]

Object Tracking using Spatio-Temporal Networks for Future Prediction Location
Yuan Liu, Ruoteng Li, Yu Cheng, Robby T. Tan, Xiubao Sui
[pdf]
[DOI]

Pillar-based Object Detection for Autonomous Driving
Yue Wang, Alireza Fathi, Abhijit Kundu, David A. Ross, Caroline Pantofaru, Tom Funkhouser, Justin Solomon
[pdf]
[DOI]

Sparse Adversarial Attack via Perturbation Factorization
Yanbo Fan, Baoyuan Wu, Tuanhui Li, Yong Zhang, Mingyang Li, Zhifeng Li, Yujiu Yang
[pdf]
[DOI]

3D Scene Reconstruction from a Single Viewport
Maximilian Denninger, Rudolph Triebel
[pdf]
[DOI]

Learning to Optimize Domain Specific Normalization for Domain Generalization
Seonguk Seo, Yumin Suh, Dongwan Kim, Geeho Kim, Jongwoo Han, Bohyung Han
[pdf]
[DOI]

Self-supervised Outdoor Scene Relighting
Ye Yu, Abhimitra Meka, Mohamed Elgharib, Hans-Peter Seidel, Christian Theobalt, William A. P. Smith
[pdf]
[DOI]

Privacy Preserving Visual SLAM
Mikiya Shibuya, Shinya Sumikura, Ken Sakurada
[pdf]
[DOI]

Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning
Valentina Sanguineti, Pietro Morerio, Niccolò Pozzetti, Danilo Greco, Marco Cristani, Vittorio Murino
[pdf]
[DOI]

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval
Yanbei Chen, Loris Bazzani
[pdf]
[DOI]

Globally Optimal and Efficient Vanishing Point Estimation in Atlanta World
Haoang Li, Pyojin Kim, Ji Zhao, Kyungdon Joo, Zhipeng Cai, Zhe Liu, Yun-Hui Liu
[pdf]
[DOI]

StyleGAN2 Distillation for Feed-forward Image Manipulation
Yuri Viazovetskyi, Vladimir Ivashkin, Evgeny Kashin
[pdf]
[DOI]

Self-Prediction for Joint Instance and Semantic Segmentation of Point Clouds
Jinxian Liu, Minghui Yu, Bingbing Ni, Ye Chen
[pdf]
[DOI]

Learning Disentangled Representations via Mutual Information Estimation
Eduardo Hugo Sanchez, Mathieu Serrurier, Mathias Ortner
[pdf]
[DOI]

Challenge-Aware RGBT Tracking
Chenglong Li, Lei Liu, Andong Lu, Qing Ji, Jin Tang
[pdf]
[DOI]

Fully Trainable and Interpretable Non-Local Sparse Models for Image Restoration
Bruno Lecouat, Jean Ponce, Julien Mairal
[pdf]
[DOI]

AutoSimulate: (Quickly) Learning Synthetic Data Generation
Harkirat Singh Behl, Atilim Güneş Baydin, Ran Gal, Philip H.S. Torr, Vibhav Vineet
[pdf]
[DOI]

LatticeNet: Towards Lightweight Image Super-resolution with Lattice Block
Xiaotong Luo, Yuan Xie, Yulun Zhang, Yanyun Qu, Cuihua Li, Yun Fu
[pdf]
[DOI]

Learning from Scale-Invariant Examples for Domain Adaptation in Semantic Segmentation
M. Naseer Subhani, Mohsen Ali
[pdf]
[DOI]

Active Visual Information Gathering for Vision-Language Navigation
Hanqing Wang, Wenguan Wang, Tianmin Shu, Wei Liang, Jianbing Shen
[pdf]
[DOI]

Deep Hough-Transform Line Priors
Yancong Lin, Silvia L. Pintea, Jan C. van Gemert
[pdf]
[DOI]

Unsupervised Shape and Pose Disentanglement for 3D Meshes
Keyang Zhou, Bharat Lal Bhatnagar, Gerard Pons-Moll
[pdf]
[DOI]

CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection
Muhammad Zaigham Zaheer, Arif Mahmood, Marcella Astrid, Seung-Ik Lee
[pdf]
[DOI]

Inclusive GAN: Improving Data and Minority Coverage in Generative Models
Ning Yu, Ke Li, Peng Zhou, Jitendra Malik, Larry Davis, Mario Fritz
[pdf]
[DOI]

SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects
Evangelos Ntavelis, Andrés Romero, Iason Kastanis, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

Dive Deeper Into Box for Object Detection
Ran Chen, Yong Liu, Mengdan Zhang, Shu Liu, Bei Yu, Yu-Wing Tai
[pdf]
[DOI]

PG-Net: Pixel to Global Matching Network for Visual Tracking
Bingyan Liao, Chenye Wang, Yayun Wang, Yaonong Wang, Jun Yin
[pdf]
[DOI]

Why Are Deep Representations Good Perceptual Quality Features?
Taimoor Tariq, Okan Tarhan Tursun, Munchurl Kim, Piotr Didyk
[pdf]
[DOI]

Geometric Estimation via Robust Subspace Recovery
Aoxiang Fan, Xingyu Jiang, Yang Wang, Junjun Jiang, Jiayi Ma
[pdf]
[DOI]

Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification
Sanath Narayan, Akshita Gupta, Fahad Shahbaz Khan, Cees G. M. Snoek, Ling Shao
[pdf]
[DOI]

Human Correspondence Consensus for 3D Object Semantic Understanding
Yujing Lou, Yang You, Chengkun Li, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu
[pdf]
[DOI]

Learning Memory Augmented Cascading Network for Compressed Sensing of Images
Jiwei Chen, Yubao Sun, Qingshan Liu, Rui Huang
[pdf]
[DOI]

Least squares surface reconstruction on arbitrary domains
Dizhong Zhu, William A. P. Smith
[pdf]
[DOI]

Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery
My Kieu, Andrew D. Bagdanov, Marco Bertini, Alberto del Bimbo
[pdf]
[DOI]

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting
Junhua Zou, Zhisong Pan, Junyang Qiu, Xin Liu, Ting Rui, Wei Li
[pdf]
[DOI]

DADA: Differentiable Automatic Data Augmentation
Yonggang Li, Guosheng Hu, Yongtao Wang, Timothy Hospedales, Neil M. Robertson, Yongxin Yang
[pdf]
[DOI]

SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner
[pdf]
[DOI]

Kinship Identification through Joint Learning using Kinship Verification Ensembles
Wei Wang, Shaodi You, Theo Gevers
[pdf]
[DOI]

Kernelized Memory Network for Video Object Segmentation
Hongje Seong, Junhyuk Hyun, Euntai Kim
[pdf]
[DOI]

A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection
Xiaoqi Zhao, Lihe Zhang, Youwei Pang, Huchuan Lu, Lei Zhang
[pdf]
[DOI]

Splitting vs. Merging: Mining Object Regions with Discrepancy and Intersection Loss for Weakly Supervised Semantic Segmentation
Tianyi Zhang, Guosheng Lin, Weide Liu, Jianfei Cai, Alex Kot
[pdf]
[DOI]

Temporal Keypoint Matching and Refinement Network for Pose Estimation and Tracking
Chunluan Zhou, Zhou Ren, Gang Hua
[pdf]
[DOI]

Neural Point-Based Graphics
Kara-Ali Aliev, Artem Sevastopolsky, Maria Kolos, Dmitry Ulyanov, Victor Lempitsky
[pdf]
[DOI]

FHDe²Net: Full High Definition Demoireing Network
Bin He, Ce Wang, Boxin Shi, Ling-Yu Duan
[pdf]
[DOI]

Learning Structural Similarity of User Interface Layouts using Graph Networks
Dipu Manandhar, Dan Ruta, John Collomosse
[pdf]
[DOI]

NAS-Count: Counting-by-Density with Neural Architecture Search
Yutao Hu, Xiaolong Jiang, Xuhui Liu, Baochang Zhang, Jungong Han, Xianbin Cao, David Doermann
[pdf]
[DOI]

Towards Generalization Across Depth for Monocular 3D Object Detection
Andrea Simonelli, Samuel Rota Buló, Lorenzo Porzi, Elisa Ricci, Peter Kontschieder
[pdf]
[DOI]

Margin-Mix: Semi-Supervised Learning for Face Expression Recognition
Corneliu Florea, Mihai Badea, Laura Florea, Andrei Racoviteanu, Constantin Vertan
[pdf]
[DOI]

Principal Feature Visualisation in Convolutional Neural Networks
Marianne Bakken, Johannes Kvam, Alexey A. Stepanov, Asbjørn Berge
[pdf]
[DOI]

Progressive Refinement Network for Occluded Pedestrian Detection
Xiaolin Song, Kaili Zhao, Wen-Sheng Chu, Honggang Zhang, Jun Guo
[pdf]
[DOI]

Monocular Real-Time Volumetric Performance Capture
Ruilong Li, Yuliang Xiu, Shunsuke Saito, Zeng Huang, Kyle Olszewski, Hao Li
[pdf]
[DOI]

The Mapillary Traffic Sign Dataset for Detection and Classification on a Global Scale
Christian Ertler, Jerneja Mislej, Tobias Ollmann, Lorenzo Porzi, Gerhard Neuhold, Yubin Kuang
[pdf]
[DOI]

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction
Anil Armagan, Guillermo Garcia-Hernando, Seungryul Baek, Shreyas Hampali, Mahdi Rad, Zhaohui Zhang, Shipeng Xie, MingXiu Chen, Boshen Zhang, Fu Xiong, Yang Xiao, Zhiguo Cao, Junsong Yuan, Pengfei Ren, Weiting Huang, Haifeng Sun, Marek Hrúz, Jakub Kanis, Zdeněk Krňoul, Qingfu Wan, Shile Li, Linlin Yang, Dongheui Lee, Angela Yao, Weiguo Zhou, Sijia Mei, Yunhui Liu, Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Philippe Weinzaepfel, Romain Brégier, Grégory Rogez, Vincent Lepetit, Tae-Kyun Kim
[pdf]
[DOI]

Disentangling Multiple Features in Video Sequences using Gaussian Processes in Variational Autoencoders
Sarthak Bhagat, Shagun Uppal, Zhuyun Yin, Nengli Lim
[pdf]
[DOI]

SEN: A Novel Feature Normalization Dissimilarity Measure for Prototypical Few-Shot Learning Networks
Van Nhan Nguyen, Sigurd Løkse, Kristoffer Wickstrøm, Michael Kampffmeyer, Davide Roverso, Robert Jenssen
[pdf]
[DOI]

Kinematic 3D Object Detection in Monocular Video
Garrick Brazil, Gerard Pons-Moll, Xiaoming Liu, Bernt Schiele
[pdf]
[DOI]

Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu, Yu Wu, Yi Yang, Yan Yan
[pdf]
[DOI]

SACA Net: Cybersickness Assessment of Individual Viewers for VR Content via Graph-based Symptom Relation Embedding
Sangmin Lee, Jung Uk Kim, Hak Gu Kim, Seongyeop Kim, Yong Man Ro
[pdf]
[DOI]

End-to-End Low Cost Compressive Spectral Imaging with Spatial-Spectral Self-Attention
Ziyi Meng, Jiawei Ma, Xin Yuan
[pdf]
[DOI]

Know Your Surroundings: Exploiting Scene Information for Object Tracking
Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte
[pdf]
[DOI]

Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases
Ren Wang, Gaoyuan Zhang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong, Meng Wang
[pdf]
[DOI]

Anatomy-Aware Siamese Network: Exploiting Semantic Asymmetry for Accurate Pelvic Fracture Detection in X-ray Images
Haomin Chen, Yirui Wang, Kang Zheng, Weijian Li, Chi-Tung Chang, Adam P. Harrison, Jing Xiao, Gregory D. Hager, Le Lu, Chien-Hung Liao, Shun Miao
[pdf]
[DOI]

DeepLandscape: Adversarial Modeling of Landscape Videos
Elizaveta Logacheva, Roman Suvorov, Oleg Khomenko, Anton Mashikhin, Victor Lempitsky
[pdf]
[DOI]

GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images
Lei Kang, Pau Riba, Yaxing Wang, Marçal Rusiñol, Alicia Fornés, Mauricio Villegas
[pdf]
[DOI]

Spatial-Angular Interaction for Light Field Image Super-Resolution
Yingqian Wang, Longguang Wang, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo
[pdf]
[DOI]

BATS: Binary ArchitecTure Search
Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
[pdf]
[DOI]

A Closer Look at Local Aggregation Operators in Point Cloud Analysis
Ze Liu, Han Hu, Yue Cao, Zheng Zhang, Xin Tong
[pdf]
[DOI]

Look here! A parametric learning based approach to redirect visual attention
Youssef A. Mejjati, Celso F. Gomez, Kwang In Kim, Eli Shechtman, Zoya Bylinskii
[pdf]
[DOI]

Variational Diffusion Autoencoders with Random Walk Sampling
Henry Li, Ofir Lindenbaum, Xiuyuan Cheng, Alexander Cloninger
[pdf]
[DOI]

Adaptive Variance Based Label Distribution Learning For Facial Age Estimation
Xin Wen, Biying Li, Haiyun Guo, Zhiwei Liu, Guosheng Hu, Ming Tang, Jinqiao Wang
[pdf]
[DOI]

Connecting the Dots: Detecting Adversarial Perturbations Using Context Inconsistency
Shasha Li, Shitong Zhu, Sudipta Paul, Amit Roy-Chowdhury, Chengyu Song, Srikanth Krishnamurthy, Ananthram Swami, Kevin S Chan
[pdf]
[DOI]

Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations
Abbas Sadat, Sergio Casas, Mengye Ren, Xinyu Wu, Pranaab Dhawan, Raquel Urtasun
[pdf]
[DOI]

VarSR: Variational Super-Resolution Network for Very Low Resolution Images
Sangeek Hyun, Jae-Pil Heo
[pdf]
[DOI]

Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation
Ashwin Raju, Chi-Tung Cheng, Yuankai Huo, Jinzheng Cai, Junzhou Huang, Jing Xiao, Le Lu, Chien-Hung Liao, Adam P. Harrison
[pdf]
[DOI]

Towards Recognizing Unseen Categories in Unseen Domains
Massimiliano Mancini, Zeynep Akata, Elisa Ricci, Barbara Caputo
[pdf]
[DOI]

Square Attack: a query-efficient black-box adversarial attack via random search
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein
[pdf]
[DOI]

You Are Here: Geolocation by Embedding Maps and Images
Noe Samano, Mengjie Zhou, Andrew Calway
[pdf]
[DOI]

Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation
Yang He, Shadi Rahimian, Bernt Schiele, Mario Fritz
[pdf]
[DOI]

From Image to Stability: Learning Dynamics from Human Pose
Jesse Scott, Bharadwaj Ravichandran, Christopher Funk, Robert T. Collins, Yanxi Liu
[pdf]
[DOI]

LevelSet R-CNN: A Deep Variational Method for Instance Segmentation
Namdar Homayounfar, Yuwen Xiong, Justin Liang, Wei-Chiu Ma, Raquel Urtasun
[pdf]
[DOI]

Efficient Scale-Permuted Backbone with Learned Resource Distribution
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Yin Cui, Mingxing Tan, Quoc Le, Xiaodan Song
[pdf]
[DOI]

Reducing Distributional Uncertainty by Mutual Information Maximisation and Transferable Feature Learning
Jian Gao, Yang Hua, Guosheng Hu, Chi Wang, Neil M. Robertson
[pdf]
[DOI]

Bridging Knowledge Graphs to Generate Scene Graphs
Alireza Zareian, Svebor Karaman, Shih-Fu Chang
[pdf]
[DOI]

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting
Sergio Casas, Cole Gulino, Simon Suo, Katie Luo, Renjie Liao, Raquel Urtasun
[pdf]
[DOI]

Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian, Zhecan Wang, Haoxuan You, Shih-Fu Chang
[pdf]
[DOI]

MPCC: Matching Priors and Conditionals for Clustering
Nicolás Astorga, Pablo Huijse, Pavlos Protopapas, Pablo Estévez
[pdf]
[DOI]

PointAR: Efficient Lighting Estimation for Mobile Augmented Reality
Yiqin Zhao, Tian Guo
[pdf]
[DOI]

Discrete Point Flow Networks for Efficient Point Cloud Generation
Roman Klokov, Edmond Boyer, Jakob Verbeek
[pdf]
[DOI]

Accelerating Deep Learning with Millions of Classes
Zhuoning Yuan, Zhishuai Guo, Xiaotian Yu, Xiaoyu Wang, Tianbao Yang
[pdf]
[DOI]

Password-conditioned Anonymization and Deanonymization with Face Identity Transformers
Xiuye Gu, Weixin Luo, Michael S. Ryoo, Yong Jae Lee
[pdf]
[DOI]

Inertial Safety from Structured Light
Sizhuo Ma, Mohit Gupta
[pdf]
[DOI]

PointTriNet: Learned Triangulation of 3D Point Sets
Nicholas Sharp, Maks Ovsjanikov
[pdf]
[DOI]

Toward Unsupervised, Multi-Object Discovery in Large-Scale Image Collections
Huy V. Vo, Patrick Pérez, Jean Ponce
[pdf]
[DOI]

Deep Novel View Synthesis from Colored 3D Point Clouds
Zhenbo Song, Wayne Chen, Dylan Campbell, Hongdong Li
[pdf]
[DOI]

Consensus-Aware Visual-Semantic Embedding for Image-Text Matching
Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma
[pdf]
[DOI]

Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising
Guanting Dong, Yueyi Zhang, Zhiwei Xiong
[pdf]
[DOI]

Sat2Graph: Road Graph Extraction through Graph-Tensor Encoding
Songtao He, Favyen Bastani, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Mohamed M. Elshrif, Samuel Madden, Mohammad Amin Sadeghi
[pdf]
[DOI]

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Di Hu, Xuhong Li, Lichao Mou, Pu Jin, Dong Chen, Liping Jing, Xiaoxiang Zhu, Dejing Dou
[pdf]
[DOI]

Polarimetric Multi-View Inverse Rendering
Jinyu Zhao, Yusuke Monno, Masatoshi Okutomi
[pdf]
[DOI]

SideInfNet: A Deep Neural Network for Semi-Automatic Semantic Segmentation with Side Information
Jing Yu Koh, Duc Thanh Nguyen, Quang-Trung Truong, Sai-Kit Yeung, Alexander Binder
[pdf]
[DOI]

Improving Face Recognition by Clustering Unlabeled Faces in the Wild
Aruni RoyChowdhury, Xiang Yu, Kihyuk Sohn, Erik Learned-Miller, Manmohan Chandraker
[pdf]
[DOI]

NeuRoRA: Neural Robust Rotation Averaging
Pulak Purkait, Tat-Jun Chin, Ian Reid
[pdf]
[DOI]

SG-VAE: Scene Grammar Variational Autoencoder to generate new indoor scenes
Pulak Purkait, Christopher Zach, Ian Reid
[pdf]
[DOI]

Unsupervised Learning of Optical Flow with Deep Feature Similarity
Woobin Im, Tae-Kyun Kim, Sung-Eui Yoon
[pdf]
[DOI]

Blended Grammar Network for Human Parsing
Xiaomei Zhang, Yingying Chen, Bingke Zhu, Jinqiao Wang, Ming Tang
[pdf]
[DOI]

P²Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
Zehao Yu, Lei Jin, Shenghua Gao
[pdf]
[DOI]

Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
[pdf]
[DOI]

Adaptive Mixture Regression Network with Local Counting Map for Crowd Counting
Xiyang Liu, Jie Yang, Wenrui Ding, Tieqiang Wang, Zhijin Wang, Junjun Xiong
[pdf]
[DOI]

BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging
Ziheng Cheng, Ruiying Lu, Zhengjue Wang, Hao Zhang, Bo Chen, Ziyi Meng, Xin Yuan
[pdf]
[DOI]

Ultra Fast Structure-aware Deep Lane Detection
Zequn Qin, Huanyu Wang, Xi Li
[pdf]
[DOI]

Cross-Identity Motion Transfer for Arbitrary Objects through Pose-Attentive Video Reassembling
Subin Jeon, Seonghyeon Nam, Seoung Wug Oh, Seon Joo Kim
[pdf]
[DOI]

Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN
Zhenwei He, Lei Zhang
[pdf]
[DOI]

Exclusivity-Consistency Regularized Knowledge Distillation for Face Recognition
Xiaobo Wang, Tianyu Fu, Shengcai Liao, Shuo Wang, Zhen Lei, Tao Mei
[pdf]
[DOI]

Learning Camera-Aware Noise Models
Ke-Chi Chang, Ren Wang, Hung-Jin Lin, Yu-Lun Liu, Chia-Ping Chen, Yu-Lin Chang, Hwann-Tzong Chen
[pdf]
[DOI]

Towards Precise Completion of Deformable Shapes
Oshri Halimi, Ido Imanuel, Or Litany, Giovanni Trappolini, Emanuele Rodolà, Leonidas Guibas, Ron Kimmel
[pdf]
[DOI]

Iterative Distance-Aware Similarity Matrix Convolution with Mutual-Supervised Point Elimination for Efficient Point Cloud Registration
Jiahao Li, Changhao Zhang, Ziyao Xu, Hangning Zhou, Chi Zhang
[pdf]
[DOI]

Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization
Amir Rahimi, Amirreza Shaban, Thalaiyasingam Ajanthan, Richard Hartley, Byron Boots
[pdf]
[DOI]

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
Xin Eric Wang, Vihan Jain, Eugene Ie, William Yang Wang, Zornitsa Kozareva, Sujith Ravi
[pdf]
[DOI]

TPFN: Applying Outer Product along Time to Multimodal Sentiment Analysis Fusion on Incomplete Data
Binghua Li, Chao Li, Feng Duan, Ning Zheng, Qibin Zhao
[pdf]
[DOI]

ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis
Eu Wern Teh, Terrance DeVries, Graham W. Taylor
[pdf]
[DOI]

Learning with Privileged Information for Efficient Image Super-Resolution
Wonkyung Lee, Junghyup Lee, Dohyung Kim, Bumsub Ham
[pdf]
[DOI]

Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification
Jianing Li, Shiliang Zhang
[pdf]
[DOI]

Autoencoder-based Graph Construction for Semi-supervised Learning
Mingeun Kang, Kiwon Lee, Yong H. Lee, Changho Suh
[pdf]
[DOI]

Virtual Multi-view Fusion for 3D Semantic Segmentation
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru
[pdf]
[DOI]

Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition
Ke Cheng, Yifan Zhang, Congqi Cao, Lei Shi, Jian Cheng, Hanqing Lu
[pdf]
[DOI]

Deep Shape from Polarization
Yunhao Ba, Alex Gilbert, Franklin Wang, Jinfa Yang, Rui Chen, Yiqin Wang, Lei Yan, Boxin Shi, Achuta Kadambi
[pdf]
[DOI]

A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning
Xingyu Chen, Xuguang Lan, Fuchun Sun, Nanning Zheng
[pdf]
[DOI]

Mind the Discriminability: Asymmetric Adversarial Domain Adaptation
Jianfei Yang, Han Zou, Yuxun Zhou, Zhaoyang Zeng, Lihua Xie
[pdf]
[DOI]

SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates
Zhizhong Han, Guanhui Qiao, Yu-Shen Liu, Matthias Zwicker
[pdf]
[DOI]

Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking
ShiJie Sun, Naveed Akhtar, XiangYu Song, HuanSheng Song, Ajmal Mian, Mubarak Shah
[pdf]
[DOI]

Deep FusionNet for Point Cloud Semantic Segmentation
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr
[pdf]
[DOI]

Deep Material Recognition in Light-Fields via Disentanglement of Spatial and Angular Information
Bichuan Guo, Jiangtao Wen, Yuxing Han
[pdf]
[DOI]

Dual Adversarial Network for Deep Active Learning
Shuo Wang, Yuexiang Li, Kai Ma, Ruhui Ma, Haibing Guan, Yefeng Zheng
[pdf]
[DOI]

Fully Convolutional Networks for Continuous Sign Language Recognition
Ka Leong Cheng, Zhaoyang Yang, Qifeng Chen, Yu-Wing Tai
[pdf]
[DOI]

Self-adapting confidence estimation for stereo
Matteo Poggi, Filippo Aleotti, Fabio Tosi, Giulio Zaccaroni, Stefano Mattoccia
[pdf]
[DOI]

Deep Surface Normal Estimation on the 2-Sphere with Confidence Guided Semantic Attention
Quewei Li, Jie Guo, Yang Fei, Qinyu Tang, Wenxiu Sun, Jin Zeng, Yanwen Guo
[pdf]
[DOI]

AutoSTR: Efficient Backbone Search for Scene Text Recognition
Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai
[pdf]
[DOI]

Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification
Sungwon Han, Sungwon Park, Sungkyu Park, Sundong Kim, Meeyoung Cha
[pdf]
[DOI]

Adversarial Training with Bi-directional Likelihood Regularization for Visual Classification
Weitao Wan, Jiansheng Chen, Ming-Hsuan Yang
[pdf]
[DOI]

Faster AutoAugment: Learning Augmentation Strategies Using Backpropagation
Ryuichiro Hataya, Zdenek Jan, Kazuki Yoshizoe, Hideki Nakayama
[pdf]
[DOI]

Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation
Lin Huang, Jianchao Tan, Ji Liu, Junsong Yuan
[pdf]
[DOI]

Boundary-Aware Cascade Networks for Temporal Action Segmentation
Zhenzhi Wang, Ziteng Gao, Limin Wang, Zhifeng Li, Gangshan Wu
[pdf]
[DOI]

Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation
Xu Yan, Weibing Zhao, Kun Yuan, Ruimao Zhang, Zhen Li, Shuguang Cui
[pdf]
[DOI]

Inference Graphs for CNN Interpretation
Yael Konforti, Alon Shpigler, Boaz Lerner, Aharon Bar-Hillel
[pdf]
[DOI]

An End-to-End OCR Text Re-organization Sequence Learning for Rich-text Detail Image Comprehension
Liangcheng Li, Feiyu Gao, Jiajun Bu, Yongpan Wang, Zhi Yu, Qi Zheng
[pdf]
[DOI]

Improving Query Efficiency of Black-box Adversarial Attack
Yang Bai, Yuyuan Zeng, Yong Jiang, Yisen Wang, Shu-Tao Xia, Weiwei Guo
[pdf]
[DOI]

Self-similarity Student for Partial Label Histopathology Image Segmentation
Hsien-Tzu Cheng, Chun-Fu Yeh, Po-Chen Kuo, Andy Wei, Keng-Chi Liu, Mong-Chi Ko, Kuan-Hua Chao, Yu-Ching Peng, Tyng-Luh Liu
[pdf]
[DOI]

BioMetricNet: deep unconstrained face verification through learning of metrics regularized onto Gaussian distributions
Arslan Ali, Matteo Testa, Tiziano Bianchi, Enrico Magli
[pdf]
[DOI]

A Decoupled Learning Scheme for Real-world Burst Denoising from Raw Images
Zhetong Liang, Shi Guo, Hong Gu, Huaqi Zhang, Lei Zhang
[pdf]
[DOI]

Global-and-Local Relative Position Embedding for Unsupervised Video Summarization
Yunjae Jung, Donghyeon Cho, Sanghyun Woo, In So Kweon
[pdf]
[DOI]

Real-World Blur Dataset for Learning and Benchmarking Deblurring Algorithms
Jaesung Rim, Haeyun Lee, Jucheol Won, Sunghyun Cho
[pdf]
[DOI]

SPARK: Spatial-aware Online Incremental Attack Against Visual Tracking
Qing Guo, Xiaofei Xie, Felix Juefei-Xu, Lei Ma, Zhongguo Li, Wanli Xue, Wei Feng, Yang Liu
[pdf]
[DOI]

CenterNet Heatmap Propagation for Real-time Video Object Detection
Zhujun Xu, Emir Hrustic, Damien Vivet
[pdf]
[DOI]

Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection
Youwei Pang, Lihe Zhang, Xiaoqi Zhao, Huchuan Lu
[pdf]
[DOI]

SOLAR: Second-Order Loss and Attention for Image Retrieval
Tony Ng, Vassileios Balntas, Yurun Tian, Krystian Mikolajczyk
[pdf]
[DOI]

Fixing Localization Errors to Improve Image Classification
Guolei Sun, Salman Khan, Wen Li, Hisham Cholakkal, Fahad Shahbaz Khan, Luc Van Gool
[pdf]
[DOI]

PatchPerPix for Instance Segmentation
Lisa Mais, Peter Hirsch, Dagmar Kainmueller
[pdf]
[DOI]

Attend and Segment: Attention Guided Active Semantic Segmentation
Soroush Seifi, Tinne Tuytelaars
[pdf]
[DOI]

Accelerating CNN Training by Pruning Activation Gradients
Xucheng Ye, Pengcheng Dai, Junyu Luo, Xin Guo, Yingjie Qi, Jianlei Yang, Yiran Chen
[pdf]
[DOI]

Global and Local Enhancement Networks for Paired and Unpaired Image Enhancement
Han-Ul Kim, Young Jun Koh, Chang-Su Kim
[pdf]
[DOI]

Probabilistic Anchor Assignment with IoU Prediction for Object Detection
Kang Kim, Hee Seok Lee
[pdf]
[DOI]

Eyeglasses 3D shape reconstruction from a single face image
Yating Wang, Quan Wang, Feng Xu
[pdf]
[DOI]

Temporal Complementary Learning for Video Person Re-Identification
Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
[pdf]
[DOI]

HoughNet: Integrating near and long-range evidence for bottom-up object detection
Nermin Samet, Samet Hicsonmez, Emre Akbas
[pdf]
[DOI]

Graph Wasserstein Correlation Analysis for Movie Retrieval
Xueya Zhang, Tong Zhang, Xiaobin Hong, Zhen Cui, Jian Yang
[pdf]
[DOI]

Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu, Zhanghui Kuang, Limin Wang, Wayne Zhang, Gangshan Wu
[pdf]
[DOI]

Full-Time Monocular Road Detection Using Zero-Distribution Prior of Angle of Polarization
Ning Li, Yongqiang Zhao, Quan Pan, Seong G. Kong, Jonathan Cheung-Wai Chan
[pdf]
[DOI]

A Flexible Recurrent Residual Pyramid Network for Video Frame Interpolation
Haoxian Zhang, Yang Zhao, Ronggang Wang
[pdf]
[DOI]

Learning Enriched Features for Real Image Restoration and Enhancement
Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao
[pdf]
[DOI]

Detail Preserved Point Cloud Completion via Separated Feature Aggregation
Wenxiao Zhang, Qingan Yan, Chunxia Xiao
[pdf]
[DOI]

LabelEnc: A New Intermediate Supervision Method for Object Detection
Miao Hao, Yitao Liu, Xiangyu Zhang, Jian Sun
[pdf]
[DOI]

Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets
Clara Fernandez-Labrador, Ajad Chhatkuli, Danda Pani Paudel, Jose J. Guerrero, Cédric Demonceaux, Luc Van Gool
[pdf]
[DOI]

PAMS: Quantized Super-Resolution via Parameterized Max Scale
Huixia Li, Chenqian Yan, Shaohui Lin, Xiawu Zheng, Baochang Zhang, Fan Yang, Rongrong Ji
[pdf]
[DOI]

SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds
Xinge Zhu, Yuexin Ma, Tai Wang, Yan Xu, Jianping Shi, Dahua Lin
[pdf]
[DOI]

OID: Outlier Identifying and Discarding in Blind Image Deblurring
Liang Chen, Faming Fang, Jiawei Zhang, Jun Liu, Guixu Zhang
[pdf]
[DOI]

Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors
Mateusz Michalkiewicz, Sarah Parisot, Stavros Tsogkas, Mahsa Baktashmotlagh, Anders Eriksson, Eugene Belilovsky
[pdf]
[DOI]

Enhanced Sparse Model for Blind Deblurring
Liang Chen, Faming Fang, Shen Lei, Fang Li, Guixu Zhang
[pdf]
[DOI]

SumGraph: Video Summarization via Recursive Graph Modeling
Jungin Park, Jiyoung Lee, Ig-Jae Kim, Kwanghoon Sohn
[pdf]
[DOI]

Feature Normalized Knowledge Distillation for Image Classification
Kunran Xu, Lai Rui, Yishi Li, Lin Gu
[pdf]
[DOI]

A Metric Learning Reality Check
Kevin Musgrave, Serge Belongie, Ser-Nam Lim
[pdf]
[DOI]

FTL: A universal framework for training low-bit DNNs via Feature Transfer
Kunyuan Du, Ya Zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin
[pdf]
[DOI]

XingGAN for Person Image Generation
Hao Tang, Song Bai, Li Zhang, Philip H.S. Torr, Nicu Sebe
[pdf]
[DOI]

GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering
Chuang Niu, Jun Zhang, Ge Wang, Jimin Liang
[pdf]
[DOI]

VCNet: A Robust Approach to Blind Image Inpainting
Yi Wang, Ying-Cong Chen, Xin Tao, Jiaya Jia
[pdf]
[DOI]

Learning to Predict Context-adaptive Convolution for Semantic Segmentation
Jianbo Liu, Junjun He, Yu Qiao, Jimmy S. Ren, Hongsheng Li
[pdf]
[DOI]

EfficientFCN: Holistically-guided Decoding for Semantic Segmentation
Jianbo Liu, Junjun He, Jiawei Zhang, Jimmy S. Ren, Hongsheng Li
[pdf]
[DOI]

GroSS: Group-Size Series Decomposition for Grouped Architecture Search
Henry Howard-Jenkins, Yiwen Li, Victor Adrian Prisacariu
[pdf]
[DOI]

Efficient Adversarial Attacks for Visual Object Tracking
Siyuan Liang, Xingxing Wei, Siyuan Yao, Xiaochun Cao
[pdf]
[DOI]

Globally-Optimal Event Camera Motion Estimation
Xin Peng, Yifu Wang, Ling Gao, Laurent Kneip
[pdf]
[DOI]

Weakly-supervised Learning of Human Dynamics
Petrissa Zell, Bodo Rosenhahn, Bastian Wandt
[pdf]
[DOI]

Journey Towards Tiny Perceptual Super-Resolution
Royson Lee, Łukasz Dudziak, Mohamed Abdelfattah, Stylianos I. Venieris, Hyeji Kim, Hongkai Wen, Nicholas D. Lane
[pdf]
[DOI]

What makes fake images detectable? Understanding properties that generalize
Lucy Chai, David Bau, Ser-Nam Lim, Phillip Isola
[pdf]
[DOI]

Embedding Propagation: Smoother Manifold for Few-Shot Classification
Pau Rodríguez, Issam Laradji, Alexandre Drouin, Alexandre Lacoste
[pdf]
[DOI]

Category Level Object Pose Estimation via Neural Analysis-by-Synthesis
Xu Chen, Zijian Dong, Jie Song, Andreas Geiger, Otmar Hilliges
[pdf]
[DOI]

High-Fidelity Synthesis with Disentangled Representation
Wonkwang Lee, Donggyun Kim, Seunghoon Hong, Honglak Lee
[pdf]
[DOI]

PL₁P - Point-line Minimal Problems under Partial Visibility in Three Views
Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla
[pdf]
[DOI]

Prediction and Recovery for Adaptive Low-Resolution Person Re-Identification
Ke Han, Yan Huang, Zerui Chen, Liang Wang, Tieniu Tan
[pdf]
[DOI]

Learning Canonical Representations for Scene Graph to Image Generation
Roei Herzig, Amir Bar, Huijuan Xu, Gal Chechik, Trevor Darrell, Amir Globerson
[pdf]
[DOI]

Adversarial Robustness on In- and Out-Distribution Improves Explainability
Maximilian Augustin, Alexander Meinke, Matthias Hein
[pdf]
[DOI]

Deformable Style Transfer
Sunnie S. Y. Kim, Nicholas Kolkin, Jason Salavon, Gregory Shakhnarovich
[pdf]
[DOI]

Aligning Videos in Space and Time
Senthil Purushwalkam, Tian Ye, Saurabh Gupta, Abhinav Gupta
[pdf]
[DOI]

Neural Wireframe Renderer: Learning Wireframe to Image Translations
Yuan Xue, Zihan Zhou, Xiaolei Huang
[pdf]
[DOI]

RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax
Xiao Zhang, Rui Zhao, Yu Qiao, Hongsheng Li
[pdf]
[DOI]

Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction
Kelvin Wong, Qiang Zhang, Ming Liang, Bin Yang, Renjie Liao, Abbas Sadat, Raquel Urtasun
[pdf]
[DOI]

Determining the Relevance of Features for Deep Neural Networks
Christian Reimers, Jakob Runge, Joachim Denzler
[pdf]
[DOI]

Weakly Supervised Semantic Segmentation with Boundary Exploration
Liyi Chen, Weiwei Wu, Chenchen Fu, Xiao Han, Yuntao Zhang
[pdf]
[DOI]

GANHopper: Multi-Hop GAN for Unsupervised Image-to-Image Translation
Wallace Lira, Johannes Merz, Daniel Ritchie, Daniel Cohen-Or, Hao Zhang
[pdf]
[DOI]

DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild
Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Grégory Rogez
[pdf]
[DOI]

Multi-view adaptive graph convolutions for graph classification
Nikolas Adaloglou, Nicholas Vretos, Petros Daras
[pdf]
[DOI]

Instance Adaptive Self-Training for Unsupervised Domain Adaptation
Ke Mei, Chuang Zhu, Jiaqi Zou, Shanghang Zhang
[pdf]
[DOI]

Weight Decay Scheduling and Knowledge Distillation for Active Learning
Juseung Yun, Byungjoo Kim, Junmo Kim
[pdf]
[DOI]

HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs
Hai Victor Habi, Roy H. Jennings, Arnon Netzer
[pdf]
[DOI]

Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning
Christopher Zach, Huu Le
[pdf]
[DOI]

Geometry Constrained Weakly Supervised Object Localization
Weizeng Lu, Xi Jia, Weicheng Xie, Linlin Shen, Yicong Zhou, Jinming Duan
[pdf]
[DOI]

Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning
Kshitij Dwivedi, Jiahui Huang, Radoslaw Martin Cichy, Gemma Roig
[pdf]
[DOI]

OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering
Yaniv Benny, Lior Wolf
[pdf]
[DOI]

Mining self-similarity: Label super-resolution with epitomic representations
Nikolay Malkin, Anthony Ortiz, Nebojsa Jojic
[pdf]
[DOI]

AE-OT-GAN: Training GANs from data specific latent distribution
Dongsheng An, Yang Guo, Min Zhang, Xin Qi, Na Lei, Xianfang Gu
[pdf]
[DOI]

Null-sampling for Interpretable and Fair Representations
Thomas Kehrenberg, Myles Bartlett, Oliver Thomas, Novi Quadrianto
[pdf]
[DOI]

Guiding Monocular Depth Estimation Using Depth-Attention Volume
Lam Huynh, Phong Nguyen-Ha, Jiri Matas, Esa Rahtu, Janne Heikkilä
[pdf]
[DOI]

Tracking Emerges by Looking Around Static Scenes, with Neural 3D Mapping
Adam W. Harley, Shrinidhi Kowshika Lakshmikanth, Paul Schydlo, Katerina Fragkiadaki
[pdf]
[DOI]

Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer
Yuanyi Zhong, Jianfeng Wang, Jian Peng, Lei Zhang
[pdf]
[DOI]

BézierSketch: A generative model for scalable vector sketches
Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
[pdf]
[DOI]

Semantic Relation Preserving Knowledge Distillation for Image-to-Image Translation
Zeqi Li, Ruowei Jiang, Parham Aarabi
[pdf]
[DOI]

Domain Adaptation Through Task Distillation
Brady Zhou, Nimit Kalra, Philipp Krähenbühl
[pdf]
[DOI]

PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning
Chenglin Yang, Adam Kortylewski, Cihang Xie, Yinzhi Cao, Alan Yuille
[pdf]
[DOI]

More Classifiers, Less Forgetting: A Generic Multi-classifier Paradigm for Incremental Learning
Yu Liu, Sarah Parisot, Gregory Slabaugh, Xu Jia, Ales Leonardis, Tinne Tuytelaars
[pdf]
[DOI]

Extending and Analyzing Self-Supervised Learning Across Domains
Bram Wallace, Bharath Hariharan
[pdf]
[DOI]

Multi-Source Open-Set Deep Adversarial Domain Adaptation
Sayan Rakshit, Dipesh Tamboli, Pragati Shuddhodhan Meshram, Biplab Banerjee, Gemma Roig, Subhasis Chaudhuri
[pdf]
[DOI]

Neural Batch Sampling with Reinforcement Learning for Semi-Supervised Anomaly Detection
Wen-Hsuan Chu, Kris M. Kitani
[pdf]
[DOI]

LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities
Baoxiong Jia, Yixin Chen, Siyuan Huang, Yixin Zhu, Song-Chun Zhu
[pdf]
[DOI]

Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images
Matthew Purri, Kristin Dana
[pdf]
[DOI]

Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
José Pedro Iglesias, Carl Olsson, Marcus Valtonen Örnhag
[pdf]
[DOI]

Proposal-based Video Completion
Yuan-Ting Hu, Heng Wang, Nicolas Ballas, Kristen Grauman, Alexander G. Schwing
[pdf]
[DOI]

HGNet: Hybrid Generative Network for Zero-shot Domain Adaptation
Haifeng Xia, Zhengming Ding
[pdf]
[DOI]

Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding
Kaihao Zhang, Wenhan Luo, Wenqi Ren, Jingwen Wang, Fang Zhao, Lin Ma, Hongdong Li
[pdf]
[DOI]

DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks
Hassan Dbouk, Hetul Sanghvi, Mahesh Mehendale, Naresh Shanbhag
[pdf]
[DOI]

All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling
Zhixiang Chi, Rasoul Mohammadi Nasiri, Zheng Liu, Juwei Lu, Jin Tang, Konstantinos N Plataniotis
[pdf]
[DOI]

A Broader Study of Cross-Domain Few-Shot Learning
Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, Rogerio Feris
[pdf]
[DOI]

Practical Poisoning Attacks on Neural Networks
Junfeng Guo, Cong Liu
[pdf]
[DOI]

Unsupervised Domain Adaptation in the Dissimilarity Space for Person Re-identification
Djebril Mekhazni, Amran Bhuiyan, George Ekladious, Eric Granger
[pdf]
[DOI]

Learn distributed GAN with Temporary Discriminators
Hui Qu, Yikai Zhang, Qi Chang, Zhennan Yan, Chao Chen, Dimitris Metaxas
[pdf]
[DOI]

SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems
Leo F Isikdogan, Bhavin V Nayak, Chyuan-Tyng Wu, Joao Peralta Moreira, Sushma Rao, Gilad Michael
[pdf]
[DOI]

Improving Adversarial Robustness by Enforcing Local and Global Compactness
Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham, Dinh Phung
[pdf]
[DOI]

TopoAL: An Adversarial Learning Approach for Topology-Aware Road Segmentation
Subeesh Vasu, Mateusz Kozinski, Leonardo Citraro, Pascal Fua
[pdf]
[DOI]

Channel selection using Gumbel Softmax
Charles Herrmann, Richard Strong Bowen, Ramin Zabih
[pdf]
[DOI]

Exploiting Temporal Coherence for Self-Supervised One-shot Video Re-identification
Dripta S. Raychaudhuri, Amit K. Roy-Chowdhury
[pdf]
[DOI]

An Efficient Training Framework for Reversible Neural Architectures
Zixuan Jiang, Keren Zhu, Mingjie Liu, Jiaqi Gu, David Z. Pan
[pdf]
[DOI]

Box2Seg: Attention Weighted Loss and Discriminative Feature Learning for Weakly Supervised Segmentation
Viveka Kulharia, Siddhartha Chandra, Amit Agrawal, Philip Torr, Ambrish Tyagi
[pdf]
[DOI]

FreeCam3D: Snapshot Structured Light 3D with Freely-Moving Cameras
Yicheng Wu, Vivek Boominathan, Xuan Zhao, Jacob T. Robinson, Hiroshi Kawasaki, Aswin Sankaranarayanan, Ashok Veeraraghavan
[pdf]
[DOI]

One-Pixel Signature: Characterizing CNN Models for Backdoor Detection
Shanjiaoyang Huang, Weiqi Peng, Zhiwei Jia, Zhuowen Tu
[pdf]
[DOI]

Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning
Linchao Zhu, Sercan Ö. Arık, Yi Yang, Tomas Pfister
[pdf]
[DOI]

Structure-Aware Generation Network for Recipe Generation from Images
Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao
[pdf]
[DOI]

A Simple and Effective Framework for Pairwise Deep Metric Learning
Qi Qi, Yan Yan, Zixuan Wu, Xiaoyu Wang, Tianbao Yang
[pdf]
[DOI]

Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner
Eugene Lee, Evan Chen, Chen-Yi Lee
[pdf]
[DOI]

A Recurrent Transformer Network for Novel View Action Synthesis
Kara Marie Schatz, Erik Quintanilla, Shruti Vyas, Yogesh S Rawat
[pdf]
[DOI]

Multi-view Action Recognition using Cross-view Video Prediction
Shruti Vyas, Yogesh S Rawat, Mubarak Shah
[pdf]
[DOI]

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation
Mingmin Zhen, Shiwei Li, Lei Zhou, Jiaxiang Shang, Haoan Feng, Tian Fang, Long Quan
[pdf]
[DOI]

SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction
Sriram N N, Buyu Liu, Francesco Pittaluga, Manmohan Chandraker
[pdf]
[DOI]

Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation
Jinyu Yang, Weizhi An, Sheng Wang, Xinliang Zhu, Chaochao Yan, Junzhou Huang
[pdf]
[DOI]

Efficient Outdoor 3D Point Cloud Semantic Segmentation for Critical Road Objects and Distributed Contexts
Chi-Chong Wong, Chi-Man Vong
[pdf]
[DOI]

Attributional Robustness Training using Input-Gradient Spatial Alignment
Mayank Singh, Nupur Kumari, Puneet Mangla, Abhishek Sinha, Vineeth N Balasubramanian, Balaji Krishnamurthy
[pdf]
[DOI]

Reducing the Sim-to-Real Gap for Event Cameras
Timo Stoffregen, Cedric Scheerlinck, Davide Scaramuzza, Tom Drummond, Nick Barnes, Lindsay Kleeman, Robert Mahony
[pdf]
[DOI]

Spatial Geometric Reasoning for Room Layout Estimation via Deep Reinforcement Learning
Liangliang Ren, Yangyang Song, Jiwen Lu, Jie Zhou
[pdf]
[DOI]

Learning Data Augmentation Strategies for Object Detection
Barret Zoph, Ekin D. Cubuk, Golnaz Ghiasi, Tsung-Yi Lin, Jonathon Shlens, Quoc V. Le
[pdf]
[DOI]

DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search
Xiyang Dai, Dongdong Chen, Mengchen Liu, Yinpeng Chen, Lu Yuan
[pdf]
[DOI]

A Closer Look at Generalisation in RAVEN
Steven Spratley, Krista Ehinger, Tim Miller
[pdf]
[DOI]

Supervised Edge Attention Network for Accurate Image Instance Segmentation
Xier Chen, Yanchao Lian, Licheng Jiao, Haoran Wang, YanJie Gao, Shi Lingling
[pdf]
[DOI]

Discriminative Partial Domain Adversarial Network
Jian Hu, Hongya Tuo, Chao Wang, Lingfeng Qiao, Haowen Zhong, Junchi Yan, Zhongliang Jing, Henry Leung
[pdf]
[DOI]

Differentiable Programming for Hyperspectral Unmixing using a Physics-based Dispersion Model
John Janiczek, Parth Thaker, Gautam Dasarathy, Christopher S. Edwards, Philip Christensen, Suren Jayasuriya
[pdf]
[DOI]

Deep Cross-species Feature Learning for Animal Face Recognition via Residual Interspecies Equivariant Network
Xiao Shi, Chenxue Yang, Xue Xia, Xiujuan Chai
[pdf]
[DOI]

Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes
Liang Liao, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin’ichi Satoh
[pdf]
[DOI]

Sound2Sight: Generating Visual Dynamics from Sound and Context
Moitreya Chatterjee, Anoop Cherian
[pdf]
[DOI]

3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection
Jin Hyeok Yoo, Yecheol Kim, Jisong Kim, Jun Won Choi
[pdf]
[DOI]

NoiseRank: Unsupervised Label Noise Reduction with Dependence Models
Karishma Sharma, Pinar Donmez, Enming Luo, Yan Liu, I. Zeki Yalniz
[pdf]
[DOI]

Fast Adaptation to Super-Resolution Networks via Meta-Learning
Seobin Park, Jinsu Yoo, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim
[pdf]
[DOI]

TP-LSD: Tri-Points Based Line Segment Detector
Siyu Huang, Fangbo Qin, Pengfei Xiong, Ning Ding, Yijia He, Xiao Liu
[pdf]
[DOI]

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation
Chenfeng Xu, Bichen Wu, Zining Wang, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
[pdf]
[DOI]

An Attention-driven Two-stage Clustering Method for Unsupervised Person Re-Identification
Zilong Ji, Xiaolong Zou, Xiaohan Lin, Xiao Liu, Tiejun Huang, Si Wu
[pdf]
[DOI]

Toward Fine-grained Facial Expression Manipulation
Jun Ling, Han Xue, Li Song, Shuhui Yang, Rong Xie, Xiao Gu
[pdf]
[DOI]

Adaptive Object Detection with Dual Multi-Label Prediction
Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye
[pdf]
[DOI]

Table Structure Recognition using Top-Down and Bottom-Up Cues
Sachin Raja, Ajoy Mondal, C V Jawahar
[pdf]
[DOI]

Novel View Synthesis on Unpaired Data by Conditional Deformable Variational Auto-Encoder
Mingyu Yin, Li Sun, Qingli Li
[pdf]
[DOI]

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments
Jacob Krantz, Erik Wijmans, Arjun Majumdar, Dhruv Batra, Stefan Lee
[pdf]
[DOI]

Boundary Content Graph Neural Network for Temporal Action Proposal Generation
Yueran Bai, Yingying Wang, Yunhai Tong, Yang Yang, Qiyue Liu, Junhui Liu
[pdf]
[DOI]

Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition
Yunhao Ge, Jiaping Zhao, Laurent Itti
[pdf]
[DOI]

VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval
Minuk Ma, Sunjae Yoon, Junyeong Kim, Youngjoon Lee, Sunghun Kang, Chang D. Yoo
[pdf]
[DOI]

Attention-Based Query Expansion Learning
Albert Gordo, Filip Radenovic, Tamara Berg
[pdf]
[DOI]

Interpretable Foreground Object Search As Knowledge Distillation
Boren Li, Po-Yu Zhuang, Jian Gu, Mingyang Li, Ping Tan
[pdf]
[DOI]

Improving Knowledge Distillation via Category Structure
Zailiang Chen, Xianxian Zheng, Hailan Shen, Ziyang Zeng, Yukun Zhou, Rongchang Zhao
[pdf]
[DOI]

High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images
Stephan J. Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton
[pdf]
[DOI]

Attentive Prototype Few-shot Learning with Capsule Network-based Embedding
Fangyu Wu, Jeremy S. Smith, Wenjin Lu, Chaoyi Pang, Bailing Zhang
[pdf]
[DOI]

Weakly Supervised Instance Segmentation by Learning Annotation Consistent Instances
Aditya Arun, C.V. Jawahar, M. Pawan Kumar
[pdf]
[DOI]

DA4AD: End-to-End Deep Attention-based Visual Localization for Autonomous Driving
Yao Zhou, Guowei Wan, Shenhua Hou, Li Yu, Gang Wang, Xiaofei Rui, Shiyu Song
[pdf]
[DOI]

Visual-Relation Conscious Image Generation from Structured-Text
Duc Minh Vo, Akihiro Sugimoto
[pdf]
[DOI]

Patch-wise Attack for Fooling Deep Neural Network
Lianli Gao, Qilong Zhang, Jingkuan Song, Xianglong Liu, Heng Tao Shen
[pdf]
[DOI]

Feature Pyramid Transformer
Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, Qianru Sun
[pdf]
[DOI]

MABNet: A Lightweight Stereo Network Based on Multibranch Adjustable Bottleneck Module
Jiabin Xing, Zhi Qi, Jiying Dong, Jiaxuan Cai, Hao Liu
[pdf]
[DOI]

Guided Saliency Feature Learning for Person Re-identification in Crowded Scenes
Lingxiao He, Wu Liu
[pdf]
[DOI]

Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection
Miao Zhang, Sun Xiao Fei, Jie Liu, Shuang Xu, Yongri Piao, Huchuan Lu
[pdf]
[DOI]

Explaining Image Classifiers using Statistical Fault Localization
Youcheng Sun, Hana Chockler, Xiaowei Huang, Daniel Kroening
[pdf]
[DOI]

Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers
Michal Rolínek, Paul Swoboda, Dominik Zietlow, Anselm Paulus, Vít Musil, Georg Martius
[pdf]
[DOI]

Learning Video Representations by Transforming Time
Simon Jenni, Givi Meishvili, Paolo Favaro
[pdf]
[DOI]

Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation
Madhu Vankadari, Sourav Garg, Anima Majumder, Swagat Kumar, Ardhendu Behera
[pdf]
[DOI]

Variational Connectionist Temporal Classification
Linlin Chao, Jingdong Chen, Wei Chu
[pdf]
[DOI]

End-to-end Dynamic Matching Network for Multi-view Multi-person 3d Pose Estimation
Congzhentao Huang, Shuai Jiang, Yang Li, Ziyue Zhang, Jason Traish, Chen Deng, Sam Ferguson, Richard Yi Da Xu
[pdf]
[DOI]

Orderly Disorder in Point Cloud Domain
Morteza Ghahremani, Bernard Tiddeman, Yonghuai Liu, Ardhendu Behera
[pdf]
[DOI]

Deep Decomposition Learning for Inverse Imaging Problems
Dongdong Chen, Mike E. Davies
[pdf]
[DOI]

FLOT: Scene Flow on Point Clouds guided by Optimal Transport
Gilles Puy, Alexandre Boulch, Renaud Marlet
[pdf]
[DOI]

Accurate Reconstruction of Oriented 3D Points using Affine Correspondences
Carolina Raposo, Joao P. Barreto
[pdf]
[DOI]

Volumetric Transformer Networks
Seungryong Kim, Sabine Süsstrunk, Mathieu Salzmann
[pdf]
[DOI]

360° Camera Alignment via Segmentation
Benjamin Davidson, Mohsan S. Alvi, João F. Henriques
[pdf]
[DOI]

A Novel Line Integral Transform for 2D Affine-Invariant Shape Retrieval
Bin Wang, Yongsheng Gao
[pdf]
[DOI]

Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks
Federico Baldassarre, Kevin Smith, Josephine Sullivan, Hossein Azizpour
[pdf]
[DOI]

Guided Semantic Flow
Sangryul Jeon, Dongbo Min, Seungryong Kim, Jihwan Choe, Kwanghoon Sohn
[pdf]
[DOI]

Document Structure Extraction using Prior based High Resolution Hierarchical Semantic Segmentation
Mausoom Sarkar, Milan Aggarwal, Arneh Jain, Hiresh Gupta, Balaji Krishnamurthy
[pdf]
[DOI]

Measuring the Importance of Temporal Features in Video Saliency
Matthias Tangemann, Matthias Kümmerer, Thomas S.A. Wallis, Matthias Bethge
[pdf]
[DOI]

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
[pdf]
[DOI]

Towards Reliable Evaluation of Algorithms for Road Network Reconstruction from Aerial Images
Leonardo Citraro, Mateusz Koziński, Pascal Fua
[pdf]
[DOI]

Online Continual Learning under Extreme Memory Constraints
Enrico Fini, Stéphane Lathuilière, Enver Sangineto, Moin Nabi, Elisa Ricci
[pdf]
[DOI]

Learning to Cluster under Domain Shift
Willi Menapace, Stéphane Lathuilière, Elisa Ricci
[pdf]
[DOI]

Defense Against Adversarial Attacks via Controlling Gradient Leaking on Embedded Manifolds
Yueru Li, Shuyu Cheng, Hang Su, Jun Zhu
[pdf]
[DOI]

Improving Optical Flow on a Pyramid Level
Markus Hofinger, Samuel Rota Bulò, Lorenzo Porzi, Arno Knapitsch, Thomas Pock, Peter Kontschieder
[pdf]
[DOI]

Procrustean Regression Networks: Learning 3D Structure of Non-Rigid Objects from 2D Annotations
Sungheon Park, Minsik Lee, Nojun Kwak
[pdf]
[DOI]

Learning to Learn Parameterized Classification Networks for Scalable Input Images
Duo Li, Anbang Yao, Qifeng Chen
[pdf]
[DOI]

Stereo Event-based Particle Tracking Velocimetry for 3D Fluid Flow Reconstruction
Yuanhao Wang, Ramzi Idoughi, Wolfgang Heidrich
[pdf]
[DOI]

Simplicial Complex based Point Correspondence between Images warped onto Manifolds
Charu Sharma, Manohar Kaul
[pdf]
[DOI]

Representation Learning on Visual-Symbolic Graphs for Video Understanding
Effrosyni Mavroudi, Benjamín Béjar Haro, René Vidal
[pdf]
[DOI]

Distance-Normalized Unified Representation for Monocular 3D Object Detection
Xuepeng Shi, Zhixiang Chen, Tae-Kyun Kim
[pdf]
[DOI]

Sequential Deformation for Accurate Scene Text Detection
Shanyu Xiao, Liangrui Peng, Ruijie Yan, Keyu An, Gang Yao, Jaesik Min
[pdf]
[DOI]

Where to Explore Next? ExHistCNN for History-aware Autonomous 3D Exploration
Yiming Wang, Alessio Del Bue
[pdf]
[DOI]

Semi-Supervised Segmentation based on Error-Correcting Supervision
Robert Mendel, Luis Antonio de Souza Jr, David Rauber, João Paulo Papa, Christoph Palm
[pdf]
[DOI]

Quantum-soft QUBO Suppression for Accurate Object Detection
Junde Li, Swaroop Ghosh
[pdf]
[DOI]

Label-similarity Curriculum Learning
Ürün Dogan, Aniket Anand Deshmukh, Marcin Bronislaw Machura, Christian Igel
[pdf]
[DOI]

Recurrent Image Annotation With Explicit Inter-Label Dependencies
Ayushi Dutta, Yashaswi Verma, C.V. Jawahar
[pdf]
[DOI]

Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution
Jing Yao, Danfeng Hong, Jocelyn Chanussot, Deyu Meng, Xiaoxiang Zhu, Zongben Xu
[pdf]
[DOI]

SimPose: Effectively Learning DensePose and Surface Normals of People from Simulated Data
Tyler Zhu, Per Karlsson, Christoph Bregler
[pdf]
[DOI]

ByeGlassesGAN: Identity Preserving Eyeglasses Removal for Face Images
Yu-Hui Lee, Shang-Hong Lai
[pdf]
[DOI]

Differentiable Joint Pruning and Quantization for Hardware Efficiency
Ying Wang, Yadong Lu, Tijmen Blankevoort
[pdf]
[DOI]

Learning to Generate Customized Dynamic 3D Facial Expressions
Rolandos Alexandros Potamias, Jiali Zheng, Stylianos Ploumpis, Giorgos Bouritsas, Evangelos Ververas, Stefanos Zafeiriou
[pdf]
[DOI]

LandscapeAR: Large Scale Outdoor Augmented Reality by Matching Photographs with Terrain Models Using Learned Descriptors
Jan Brejcha, Michal Lukáč, Yannick Hold-Geoffroy, Oliver Wang, Martin Čadík
[pdf]
[DOI]

Learning Disentangled Feature Representation for Hybrid-distorted Image Restoration
Xin Li, Xin Jin, Jianxin Lin, Sen Liu, Yaojun Wu, Tao Yu, Wei Zhou, Zhibo Chen
[pdf]
[DOI]

Jointly De-biasing Face Recognition and Demographic Attribute Estimation
Sixue Gong, Xiaoming Liu, Anil K. Jain
[pdf]
[DOI]

Regularized Loss for Weakly Supervised Single Class Semantic Segmentation
Olga Veksler
[pdf]
[DOI]

Spike-FlowNet: Event-based Optical Flow Estimation with Energy-Efficient Hybrid Neural Networks
Chankyu Lee, Adarsh Kumar Kosta, Alex Zihao Zhu, Kenneth Chaney, Kostas Daniilidis, Kaushik Roy
[pdf]
[DOI]

Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations
Aditya Golatkar, Alessandro Achille, Stefano Soatto
[pdf]
[DOI]

Inherent Adversarial Robustness of Deep Spiking Neural Networks: Effects of Discrete Input Encoding and Non-Linear Activations
Saima Sharmin, Nitin Rathi, Priyadarshini Panda, Kaushik Roy
[pdf]
[DOI]

Synthesizing Coupled 3D Face Modalities by Trunk-Branch Generative Adversarial Networks
Baris Gecer, Alexandros Lattas, Stylianos Ploumpis, Jiankang Deng, Athanasios Papaioannou, Stylianos Moschoglou, Stefanos Zafeiriou
[pdf]
[DOI]

Learning to Learn Words from Visual Scenes
Dídac Surís, Dave Epstein, Heng Ji, Shih-Fu Chang, Carl Vondrick
[pdf]
[DOI]

On Transferability of Histological Tissue Labels in Computational Pathology
Mahdi S. Hosseini, Lyndon Chan, Weimin Huang, Yichen Wang, Danial Hasan, Corwyn Rowsell, Savvas Damaskinos, Konstantinos N. Plataniotis
[pdf]
[DOI]

Learning Actionness via Long-range Temporal Order Verification
Dimitri Zhukov, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic
[pdf]
[DOI]

Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays
Laurie Bose, Piotr Dudek, Jianing Chen, Stephen J. Carey, Walterio W. Mayol-Cuevas
[pdf]
[DOI]

Character Region Attention For Text Spotting
Youngmin Baek, Seung Shin, Jeonghun Baek, Sungrae Park, Junyeop Lee, Daehyun Nam, Hwalsuk Lee
[pdf]
[DOI]

Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network
Anh-Huy Phan, Konstantin Sobolev, Konstantin Sozykin, Dmitry Ermilov, Julia Gusak, Petr Tichavský, Valeriy Glukhov, Ivan Oseledets, Andrzej Cichocki
[pdf]
[DOI]

Dual Mixup Regularized Learning for Adversarial Domain Adaptation
Yuan Wu, Diana Inkpen, Ahmed El-Roby
[pdf]
[DOI]

Robust and On-the-fly Dataset Denoising for Image Classification
Jiaming Song, Yann Dauphin, Michael Auli, Tengyu Ma
[pdf]
[DOI]

Imaging Behind Occluders Using Two-Bounce Light
Connor Henley, Tomohiro Maeda, Tristan Swedish, Ramesh Raskar
[pdf]
[DOI]

Improving Object Detection with Selective Self-Supervised Self-Training
Yandong Li, Di Huang, Danfeng Qin, Liqiang Wang, Boqing Gong
[pdf]
[DOI]

Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
Rohan Chabra, Jan E. Lenssen, Eddy Ilg, Tanner Schmidt, Julian Straub, Steven Lovegrove, Richard Newcombe
[pdf]
[DOI]

Info3D: Representation Learning on 3D Objects using Mutual Information Maximization and Contrastive Learning
Aditya Sanghi
[pdf]
[DOI]

Adversarial Data Augmentation via Deformation Statistics
Sahin Olut, Zhengyang Shen, Zhenlin Xu, Samuel Gerber, Marc Niethammer
[pdf]
[DOI]

Neural Predictor for Neural Architecture Search
Wei Wen, Hanxiao Liu, Yiran Chen, Hai Li, Gabriel Bender, Pieter-Jan Kindermans
[pdf]
[DOI]

Learning Permutation Invariant Representations using Memory Networks
Shivam Kalra, Mohammed Adnan, Graham Taylor, H.R. Tizhoosh
[pdf]
[DOI]

Feature Space Augmentation for Long-Tailed Data
Peng Chu, Xiao Bian, Shaopeng Liu, Haibin Ling
[pdf]
[DOI]

Laying the Foundations of Deep Long-Term Crowd Flow Prediction
Samuel S. Sohn, Honglu Zhou, Seonghyeon Moon, Sejong Yoon, Vladimir Pavlovic, Mubbasir Kapadia
[pdf]
[DOI]

Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu
[pdf]
[DOI]

Fairness by Learning Orthogonal Disentangled Representations
Mhd Hasan Sarhan, Nassir Navab, Abouzar Eslami, Shadi Albarqouni
[pdf]
[DOI]

Self-supervision with Superpixels: Training Few-shot Medical Image Segmentation without Annotation
Cheng Ouyang, Carlo Biffi, Chen Chen, Turkay Kart, Huaqi Qiu, Daniel Rueckert
[pdf]
[DOI]

On Diverse Asynchronous Activity Anticipation
He Zhao, Richard P. Wildes
[pdf]
[DOI]

Representative-Discriminative Learning for Open-set Land Cover Classification of Satellite Imagery
Razieh Kaviani Baghbaderani, Ying Qu, Hairong Qi, Craig Stutts
[pdf]
[DOI]

Structure-Aware Human-Action Generation
Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen
[pdf]
[DOI]

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition
Niamul Quader, Juwei Lu, Peng Dai, Wei Li
[pdf]
[DOI]

S³Net: Semantic-Aware Self-supervised Depth Estimation with Monocular Videos and Synthetic Data
Bin Cheng, Inderjot Singh Saggu, Raunak Shah, Gaurav Bansal, Dinesh Bharadia
[pdf]
[DOI]

Leveraging Seen and Unseen Semantic Relationships for Generative Zero-Shot Learning
Maunil R Vyas, Hemanth Venkateswara, Sethuraman Panchanathan
[pdf]
[DOI]

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks
Niamul Quader, Md Mafijul Islam Bhuiyan, Juwei Lu, Peng Dai, Wei Li
[pdf]
[DOI]

UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu
[pdf]
[DOI]

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiaowei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao
[pdf]
[DOI]

Improving Face Recognition from Hard Samples via Distribution Distillation Loss
Yuge Huang, Pengcheng Shen, Ying Tai, Shaoxin Li, Xiaoming Liu, Jilin Li, Feiyue Huang, Rongrong Ji
[pdf]
[DOI]

Extract and Merge: Superpixel Segmentation with Regional Attributes
Jianqiao An, Yucheng Shi, Yahong Han, Meijun Sun, Qi Tian
[pdf]
[DOI]

Spatial-Adaptive Network for Single Image Denoising
Meng Chang, Qi Li, Huajun Feng, Zhihai Xu
[pdf]
[DOI]

Physics-based Feature Dehazing Networks
Jiangxin Dong, Jinshan Pan
[pdf]
[DOI]

Learning Surrogates via Deep Embedding
Yash Patel, Tomáš Hodaň, Jiří Matas
[pdf]
[DOI]

An Asymmetric Modeling for Action Assessment
Jibin Gao, Wei-Shi Zheng, Jia-Hui Pan, Chengying Gao, Yaowei Wang, Wei Zeng, Jianhuang Lai
[pdf]
[DOI]

High-quality Single-model Deep Video Compression with Frame-Conv3D and Multi-frame Differential Modulation
Wenyu Sun, Chen Tang, Weigui Li, Zhuqing Yuan, Huazhong Yang, Yongpan Liu
[pdf]
[DOI]

Instance-Aware Embedding for Point Cloud Instance Segmentation
Tong He, Yifan Liu, Chunhua Shen, Xinlong Wang, Changming Sun
[pdf]
[DOI]

Self-Paced Deep Regression Forests with Consideration on Underrepresented Examples
Lili Pan, Shijie Ai, Yazhou Ren, Zenglin Xu
[pdf]
[DOI]

Manifold Projection for Adversarial Defense on Face Recognition
Jianli Zhou, Chao Liang, Jun Chen
[pdf]
[DOI]

Weakly Supervised Learning with Side Information for Noisy Labeled Images
Lele Cheng, Xiangzeng Zhou, Liming Zhao, Dangwei Li, Hong Shang, Yun Zheng, Pan Pan, Yinghui Xu
[pdf]
[DOI]

Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision
Peng Wu, Jing Liu, Yujia Shi, Yujia Sun, Fangtao Shao, Zhaoyang Wu, Zhiwei Yang
[pdf]
[DOI]

SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentation for Accurate Freespace Detection
Rui Fan, Hengli Wang, Peide Cai, Ming Liu
[pdf]
[DOI]

Modeling the Space of Point Landmark Constrained Diffeomorphisms
Chengfeng Wen, Yang Guo, Xianfeng Gu
[pdf]
[DOI]

PieNet: Personalized Image Enhancement Network
Han-Ul Kim, Young Jun Koh, Chang-Su Kim
[pdf]
[DOI]

Rotational Outlier Identification in Pose Graphs Using Dual Decomposition
Arman Karimian, Ziqi Yang, Roberto Tron
[pdf]
[DOI]

Speech-driven Facial Animation using Cascaded GANs for Learning of Motion and Texture
Dipanjan Das, Sandika Biswas, Sanjana Sinha, Brojeshwar Bhowmick
[pdf]
[DOI]

Solving Phase Retrieval with a Learned Reference
Rakib Hyder, Zikui Cai, M. Salman Asif
[pdf]
[DOI]

Dual Grid Net: Hand Mesh Vertex Regression from Single Depth Maps
Chengde Wan, Thomas Probst, Luc Van Gool, Angela Yao
[pdf]
[DOI]

Modeling Varying Camera-IMU Time Offset in Optimization-Based Visual-Inertial Odometry
Ling, Yonggen and Bao, Linchao and Jie, Zequn and Zhu, Fengming and Li, Ziyang and Tang, Shanmin and Liu, Yongsheng and Liu, Wei and Zhang, Tong
[pdf]
[bibtex]
@InProceedings{Ling_2018_ECCV,
author = {Ling, Yonggen and Bao, Linchao and Jie, Zequn and Zhu, Fengming and Li, Ziyang and Tang, Shanmin and Liu, Yongsheng and Liu, Wei and Zhang, Tong},
title = {Modeling Varying Camera-IMU Time Offset in Optimization-Based Visual-Inertial Odometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pose Partition Networks for Multi-Person Pose Estimation
Nie, Xuecheng and Feng, Jiashi and Xing, Junliang and Yan, Shuicheng
[pdf]
[bibtex]
@InProceedings{Nie_2018_ECCV,
author = {Nie, Xuecheng and Feng, Jiashi and Xing, Junliang and Yan, Shuicheng},
title = {Pose Partition Networks for Multi-Person Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition
Zhan, Xiaohang and Liu, Ziwei and Yan, Junjie and Lin, Dahua and Change Loy, Chen
[pdf]
[bibtex]
@InProceedings{Zhan_2018_ECCV,
author = {Zhan, Xiaohang and Liu, Ziwei and Yan, Junjie and Lin, Dahua and Change Loy, Chen},
title = {Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Open-World Stereo Video Matching with Deep RNN
Zhong, Yiran and Li, Hongdong and Dai, Yuchao
[pdf]
[bibtex]
@InProceedings{Zhong_2018_ECCV,
author = {Zhong, Yiran and Li, Hongdong and Dai, Yuchao},
title = {Open-World Stereo Video Matching with Deep RNN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Cross-Modal Projection Learning for Image-Text Matching
Zhang, Ying and Lu, Huchuan
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Ying and Lu, Huchuan},
title = {Deep Cross-Modal Projection Learning for Image-Text Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Gray-box Adversarial Training
Vivek, B. S. and Reddy Mopuri, Konda and Venkatesh Babu, R.
[pdf]
[bibtex]
@InProceedings{Vivek_2018_ECCV,
author = {Vivek, B. S. and Reddy Mopuri, Konda and Venkatesh Babu, R.},
title = {Gray-box Adversarial Training},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Class Model Fitting by Energy Minimization and Mode-Seeking
Barath, Daniel and Matas, Jiri
[pdf]
[bibtex]
@InProceedings{Barath_2018_ECCV,
author = {Barath, Daniel and Matas, Jiri},
title = {Multi-Class Model Fitting by Energy Minimization and Mode-Seeking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MRF Optimization with Separable Convex Prior on Partially Ordered Labels
Domokos, Csaba and Schmidt, Frank R. and Cremers, Daniel
[pdf]
[bibtex]
@InProceedings{Domokos_2018_ECCV,
author = {Domokos, Csaba and Schmidt, Frank R. and Cremers, Daniel},
title = {MRF Optimization with Separable Convex Prior on Partially Ordered Labels},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions
Li, Qing and Tao, Qingyi and Joty, Shafiq and Cai, Jianfei and Luo, Jiebo
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Qing and Tao, Qingyi and Joty, Shafiq and Cai, Jianfei and Luo, Jiebo},
title = {VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Context Refinement for Object Detection
Chen, Zhe and Huang, Shaoli and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Zhe and Huang, Shaoli and Tao, Dacheng},
title = {Context Refinement for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Cheng, Xinjing and Wang, Peng and Yang, Ruigang
[pdf]
[bibtex]
@InProceedings{Cheng_2018_ECCV,
author = {Cheng, Xinjing and Wang, Peng and Yang, Ruigang},
title = {Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Zero-Annotation Object Detection with Web Knowledge Transfer
Tao, Qingyi and Yang, Hao and Cai, Jianfei
[pdf]
[bibtex]
@InProceedings{Tao_2018_ECCV,
author = {Tao, Qingyi and Yang, Hao and Cai, Jianfei},
title = {Zero-Annotation Object Detection with Web Knowledge Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fast Light Field Reconstruction With Deep Coarse-To-Fine Modeling of Spatial-Angular Clues
Wing Fung Yeung, Henry and Hou, Junhui and Chen, Jie and Ying Chung, Yuk and Chen, Xiaoming
[pdf]
[bibtex]
@InProceedings{Yeung_2018_ECCV,
author = {Wing Fung Yeung, Henry and Hou, Junhui and Chen, Jie and Ying Chung, Yuk and Chen, Xiaoming},
title = {Fast Light Field Reconstruction With Deep Coarse-To-Fine Modeling of Spatial-Angular Clues},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

AGIL: Learning Attention from Human for Visuomotor Tasks
Zhang, Ruohan and Liu, Zhuode and Zhang, Luxin and Whritner, Jake A. and Muller, Karl S. and Hayhoe, Mary M. and Ballard, Dana H.
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Ruohan and Liu, Zhuode and Zhang, Luxin and Whritner, Jake A. and Muller, Karl S. and Hayhoe, Mary M. and Ballard, Dana H.},
title = {AGIL: Learning Attention from Human for Visuomotor Tasks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Physical Primitive Decomposition
Liu, Zhijian and Freeman, William T. and Tenenbaum, Joshua B. and Wu, Jiajun
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Zhijian and Freeman, William T. and Tenenbaum, Joshua B. and Wu, Jiajun},
title = {Physical Primitive Decomposition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Expander Networks: Efficient Deep Networks from Graph Theory
Prabhu, Ameya and Varma, Girish and Namboodiri, Anoop
[pdf]
[bibtex]
@InProceedings{Prabhu_2018_ECCV,
author = {Prabhu, Ameya and Varma, Girish and Namboodiri, Anoop},
title = {Deep Expander Networks: Efficient Deep Networks from Graph Theory},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Real-Time MDNet
Jung, Ilchae and Son, Jeany and Baek, Mooyeol and Han, Bohyung
[pdf]
[bibtex]
@InProceedings{Jung_2018_ECCV,
author = {Jung, Ilchae and Son, Jeany and Baek, Mooyeol and Han, Bohyung},
title = {Real-Time MDNet},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

The Mutex Watershed: Efficient, Parameter-Free Image Partitioning
Wolf, Steffen and Pape, Constantin and Bailoni, Alberto and Rahaman, Nasim and Kreshuk, Anna and Kothe, Ullrich and Hamprecht, Fred A.
[pdf]
[bibtex]
@InProceedings{Wolf_2018_ECCV,
author = {Wolf, Steffen and Pape, Constantin and Bailoni, Alberto and Rahaman, Nasim and Kreshuk, Anna and Kothe, Ullrich and Hamprecht, Fred A.},
title = {The Mutex Watershed: Efficient, Parameter-Free Image Partitioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MVSNet: Depth Inference for Unstructured Multi-view Stereo
Yao, Yao and Luo, Zixin and Li, Shiwei and Fang, Tian and Quan, Long
[pdf]
[bibtex]
@InProceedings{Yao_2018_ECCV,
author = {Yao, Yao and Luo, Zixin and Li, Shiwei and Fang, Tian and Quan, Long},
title = {MVSNet: Depth Inference for Unstructured Multi-view Stereo},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Audio-Visual Event Localization in Unconstrained Videos
Tian, Yapeng and Shi, Jing and Li, Bochen and Duan, Zhiyao and Xu, Chenliang
[pdf]
[bibtex]
@InProceedings{Tian_2018_ECCV,
author = {Tian, Yapeng and Shi, Jing and Li, Bochen and Duan, Zhiyao and Xu, Chenliang},
title = {Audio-Visual Event Localization in Unconstrained Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attend and Rectify: a gated attention mechanism for fine-grained recovery
Rodriguez, Pau and Gonfaus, Josep M. and Cucurull, Guillem and Xavier Roca, F. and Gonzalez, Jordi
[pdf]
[bibtex]
@InProceedings{Rodriguez_2018_ECCV,
author = {Rodriguez, Pau and Gonfaus, Josep M. and Cucurull, Guillem and Xavier Roca, F. and Gonzalez, Jordi},
title = {Attend and Rectify: a gated attention mechanism for fine-grained recovery},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PyramidBox: A Context-assisted Single Shot Face Detector
Tang, Xu and Du, Daniel K. and He, Zeqiang and Liu, Jingtuo
[pdf]
[bibtex]
@InProceedings{Tang_2018_ECCV,
author = {Tang, Xu and Du, Daniel K. and He, Zeqiang and Liu, Jingtuo},
title = {PyramidBox: A Context-assisted Single Shot Face Detector},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments
Fischer, Tobias and Jin Chang, Hyung and Demiris, Yiannis
[pdf]
[bibtex]
@InProceedings{Fischer_2018_ECCV,
author = {Fischer, Tobias and Jin Chang, Hyung and Demiris, Yiannis},
title = {RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias
Panda, Rameswar and Zhang, Jianming and Li, Haoxiang and Lee, Joon-Young and Lu, Xin and Roy-Chowdhury, Amit K.
[pdf]
[bibtex]
@InProceedings{Panda_2018_ECCV,
author = {Panda, Rameswar and Zhang, Jianming and Li, Haoxiang and Lee, Joon-Young and Lu, Xin and Roy-Chowdhury, Amit K.},
title = {Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Highly-Economized Multi-View Binary Compression for Scalable Image Clustering
Zhang, Zheng and Liu, Li and Qin, Jie and Zhu, Fan and Shen, Fumin and Xu, Yong and Shao, Ling and Tao Shen, Heng
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Zheng and Liu, Li and Qin, Jie and Zhu, Fan and Shen, Fumin and Xu, Yong and Shao, Ling and Tao Shen, Heng},
title = {Highly-Economized Multi-View Binary Compression for Scalable Image Clustering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Kalman Filtering Network for Video Compression Artifact Reduction
Lu, Guo and Ouyang, Wanli and Xu, Dong and Zhang, Xiaoyun and Gao, Zhiyong and Sun, Ming-Ting
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Guo and Ouyang, Wanli and Xu, Dong and Zhang, Xiaoyun and Gao, Zhiyong and Sun, Ming-Ting},
title = {Deep Kalman Filtering Network for Video Compression Artifact Reduction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepGUM: Learning Deep Robust Regression with a Gaussian-Uniform Mixture Model
Lathuiliere, Stephane and Mesejo, Pablo and Alameda-Pineda, Xavier and Horaud, Radu
[pdf]
[bibtex]
@InProceedings{Lathuiliere_2018_ECCV,
author = {Lathuiliere, Stephane and Mesejo, Pablo and Alameda-Pineda, Xavier and Horaud, Radu},
title = {DeepGUM: Learning Deep Robust Regression with a Gaussian-Uniform Mixture Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ISNN: Impact Sound Neural Network for Audio-Visual Object Classification
Sterling, Auston and Wilson, Justin and Lowe, Sam and Lin, Ming C.
[pdf]
[bibtex]
@InProceedings{Sterling_2018_ECCV,
author = {Sterling, Auston and Wilson, Justin and Lowe, Sam and Lin, Ming C.},
title = {ISNN: Impact Sound Neural Network for Audio-Visual Object Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval
Chen, Jiaxin and Fang, Yi
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Jiaxin and Fang, Yi},
title = {Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Blend Photos
Hung, Wei-Chih and Zhang, Jianming and Shen, Xiaohui and Lin, Zhe and Lee, Joon-Young and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Hung_2018_ECCV,
author = {Hung, Wei-Chih and Zhang, Jianming and Shen, Xiaohui and Lin, Zhe and Lee, Joon-Young and Yang, Ming-Hsuan},
title = {Learning to Blend Photos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Second-order Democratic Aggregation
Lin, Tsung-Yu and Maji, Subhransu and Koniusz, Piotr
[pdf]
[bibtex]
@InProceedings{Lin_2018_ECCV,
author = {Lin, Tsung-Yu and Maji, Subhransu and Koniusz, Piotr},
title = {Second-order Democratic Aggregation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recurrent Fusion Network for Image captioning
Jiang, Wenhao and Ma, Lin and Jiang, Yu-Gang and Liu, Wei and Zhang, Tong
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Wenhao and Ma, Lin and Jiang, Yu-Gang and Liu, Wei and Zhang, Tong},
title = {Recurrent Fusion Network for Image captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Grounding Visual Explanations
Anne Hendricks, Lisa and Hu, Ronghang and Darrell, Trevor and Akata, Zeynep
[pdf]
[bibtex]
@InProceedings{Hendricks_2018_ECCV,
author = {Anne Hendricks, Lisa and Hu, Ronghang and Darrell, Trevor and Akata, Zeynep},
title = {Grounding Visual Explanations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Dataset of Flash and Ambient Illumination Pairs from the Crowd
Aksoy, Yagiz and Kim, Changil and Kellnhofer, Petr and Paris, Sylvain and Elgharib, Mohamed and Pollefeys, Marc and Matusik, Wojciech
[pdf]
[bibtex]
@InProceedings{Aksoy_2018_ECCV,
author = {Aksoy, Yagiz and Kim, Changil and Kellnhofer, Petr and Paris, Sylvain and Elgharib, Mohamed and Pollefeys, Marc and Matusik, Wojciech},
title = {A Dataset of Flash and Ambient Illumination Pairs from the Crowd},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Continuous Fusion for Multi-Sensor 3D Object Detection
Liang, Ming and Yang, Bin and Wang, Shenlong and Urtasun, Raquel
[pdf]
[bibtex]
@InProceedings{Liang_2018_ECCV,
author = {Liang, Ming and Yang, Bin and Wang, Shenlong and Urtasun, Raquel},
title = {Deep Continuous Fusion for Multi-Sensor 3D Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

BusterNet: Detecting Copy-Move Image Forgery with Source/Target Localization
Wu, Yue and Abd-Almageed, Wael and Natarajan, Prem
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Yue and Abd-Almageed, Wael and Natarajan, Prem},
title = {BusterNet: Detecting Copy-Move Image Forgery with Source/Target Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Parallel Feature Pyramid Network for Object Detection
Kim, Seung-Wook and Kook, Hyong-Keun and Sun, Jee-Young and Kang, Mun-Cheon and Ko, Sung-Jea
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Seung-Wook and Kook, Hyong-Keun and Sun, Jee-Young and Kang, Mun-Cheon and Ko, Sung-Jea},
title = {Parallel Feature Pyramid Network for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Region Features for Object Detection
Gu, Jiayuan and Hu, Han and Wang, Liwei and Wei, Yichen and Dai, Jifeng
[pdf]
[bibtex]
@InProceedings{Gu_2018_ECCV,
author = {Gu, Jiayuan and Hu, Han and Wang, Liwei and Wei, Yichen and Dai, Jifeng},
title = {Learning Region Features for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

AMC: AutoML for Model Compression and Acceleration on Mobile Devices
He, Yihui and Lin, Ji and Liu, Zhijian and Wang, Hanrui and Li, Li-Jia and Han, Song
[pdf]
[bibtex]
@InProceedings{He_2018_ECCV,
author = {He, Yihui and Lin, Ji and Liu, Zhijian and Wang, Hanrui and Li, Li-Jia and Han, Song},
title = {AMC: AutoML for Model Compression and Acceleration on Mobile Devices},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PSDF Fusion: Probabilistic Signed Distance Function for On-the-fly 3D Data Fusion and Scene Reconstruction
Dong, Wei and Wang, Qiuyuan and Wang, Xin and Zha, Hongbin
[pdf]
[bibtex]
@InProceedings{Dong_2018_ECCV,
author = {Dong, Wei and Wang, Qiuyuan and Wang, Xin and Zha, Hongbin},
title = {PSDF Fusion: Probabilistic Signed Distance Function for On-the-fly 3D Data Fusion and Scene Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation
Zhu, Xinge and Zhou, Hui and Yang, Ceyuan and Shi, Jianping and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Xinge and Zhou, Hui and Yang, Ceyuan and Shi, Jianping and Lin, Dahua},
title = {Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Switchable Temporal Propagation Network
Liu, Sifei and Zhong, Guangyu and De Mello, Shalini and Gu, Jinwei and Jampani, Varun and Yang, Ming-Hsuan and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Sifei and Zhong, Guangyu and De Mello, Shalini and Gu, Jinwei and Jampani, Varun and Yang, Ming-Hsuan and Kautz, Jan},
title = {Switchable Temporal Propagation Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Sampling Algebraic Varieties for Robust Camera Autocalibration
Pani Paudel, Danda and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Paudel_2018_ECCV,
author = {Pani Paudel, Danda and Van Gool, Luc},
title = {Sampling Algebraic Varieties for Robust Camera Autocalibration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Image Reassembly Combining Deep Learning and Shortest Path Problem
Paumard, Marie-Morgane and Picard, David and Tabia, Hedi
[pdf]
[bibtex]
@InProceedings{Paumard_2018_ECCV,
author = {Paumard, Marie-Morgane and Picard, David and Tabia, Hedi},
title = {Image Reassembly Combining Deep Learning and Shortest Path Problem},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes
He, Yang and Schiele, Bernt and Fritz, Mario
[pdf]
[bibtex]
@InProceedings{He_2018_ECCV,
author = {He, Yang and Schiele, Bernt and Fritz, Mario},
title = {Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length
Probst, Thomas and Pani Paudel, Danda and Chhatkuli, Ajad and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Probst_2018_ECCV,
author = {Probst, Thomas and Pani Paudel, Danda and Chhatkuli, Ajad and Van Gool, Luc},
title = {Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PS-FCN: A Flexible Learning Framework for Photometric Stereo
Chen, Guanying and Han, Kai and Wong, Kwan-Yee K.
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Guanying and Han, Kai and Wong, Kwan-Yee K.},
title = {PS-FCN: A Flexible Learning Framework for Photometric Stereo},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Instance-level Human Parsing via Part Grouping Network
Gong, Ke and Liang, Xiaodan and Li, Yicheng and Chen, Yimin and Yang, Ming and Lin, Liang
[pdf]
[bibtex]
@InProceedings{Gong_2018_ECCV,
author = {Gong, Ke and Liang, Xiaodan and Li, Yicheng and Chen, Yimin and Yang, Ming and Lin, Liang},
title = {Instance-level Human Parsing via Part Grouping Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Normalized Blind Deconvolution
Jin, Meiguang and Roth, Stefan and Favaro, Paolo
[pdf]
[bibtex]
@InProceedings{Jin_2018_ECCV,
author = {Jin, Meiguang and Roth, Stefan and Favaro, Paolo},
title = {Normalized Blind Deconvolution},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks
Li, Chong and Richard Shi, C. J.
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Chong and Richard Shi, C. J.},
title = {Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dense Pose Transfer
Neverova, Natalia and Alp Guler, Riza and Kokkinos, Iasonas
[pdf]
[bibtex]
@InProceedings{Neverova_2018_ECCV,
author = {Neverova, Natalia and Alp Guler, Riza and Kokkinos, Iasonas},
title = {Dense Pose Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RCAA: Relational Context-Aware Agents for Person Search
Chang, Xiaojun and Huang, Po-Yao and Shen, Yi-Dong and Liang, Xiaodan and Yang, Yi and Hauptmann, Alexander G.
[pdf]
[bibtex]
@InProceedings{Chang_2018_ECCV,
author = {Chang, Xiaojun and Huang, Po-Yao and Shen, Yi-Dong and Liang, Xiaodan and Yang, Yi and Hauptmann, Alexander G.},
title = {RCAA: Relational Context-Aware Agents for Person Search},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Discriminative Model for Video Classification
Tavakolian, Mohammad and Hadid, Abdenour
[pdf]
[bibtex]
@InProceedings{Tavakolian_2018_ECCV,
author = {Tavakolian, Mohammad and Hadid, Abdenour},
title = {Deep Discriminative Model for Video Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition
Engin, Melih and Wang, Lei and Zhou, Luping and Liu, Xinwang
[pdf]
[bibtex]
@InProceedings{Engin_2018_ECCV,
author = {Engin, Melih and Wang, Lei and Zhou, Luping and Liu, Xinwang},
title = {DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Pictorial Gaze Estimation
Park, Seonwook and Spurr, Adrian and Hilliges, Otmar
[pdf]
[bibtex]
@InProceedings{Park_2018_ECCV,
author = {Park, Seonwook and Spurr, Adrian and Hilliges, Otmar},
title = {Deep Pictorial Gaze Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CTAP: Complementary Temporal Action Proposal Generation
Gao, Jiyang and Chen, Kan and Nevatia, Ram
[pdf]
[bibtex]
@InProceedings{Gao_2018_ECCV,
author = {Gao, Jiyang and Chen, Kan and Nevatia, Ram},
title = {CTAP: Complementary Temporal Action Proposal Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Neural Network Encapsulation
Li, Hongyang and Guo, Xiaoyang and Dai, Bo and Ouyang, Wanli and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Hongyang and Guo, Xiaoyang and Dai, Bo and Ouyang, Wanli and Wang, Xiaogang},
title = {Neural Network Encapsulation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recovering 3D Planes from a Single Image via Convolutional Neural Networks
Yang, Fengting and Zhou, Zihan
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Fengting and Zhou, Zihan},
title = {Recovering 3D Planes from a Single Image via Convolutional Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dist-GAN: An Improved GAN using Distance Constraints
Tran, Ngoc-Trung and Bui, Tuan-Anh and Cheung, Ngai-Man
[pdf]
[bibtex]
@InProceedings{Tran_2018_ECCV,
author = {Tran, Ngoc-Trung and Bui, Tuan-Anh and Cheung, Ngai-Man},
title = {Dist-GAN: An Improved GAN using Distance Constraints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Retrospective Encoders for Video Summarization
Zhang, Ke and Grauman, Kristen and Sha, Fei
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Ke and Grauman, Kristen and Sha, Fei},
title = {Retrospective Encoders for Video Summarization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Tracking Emerges by Colorizing Videos
Vondrick, Carl and Shrivastava, Abhinav and Fathi, Alireza and Guadarrama, Sergio and Murphy, Kevin
[pdf]
[bibtex]
@InProceedings{Vondrick_2018_ECCV,
author = {Vondrick, Carl and Shrivastava, Abhinav and Fathi, Alireza and Guadarrama, Sergio and Murphy, Kevin},
title = {Tracking Emerges by Colorizing Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Task-Aware Image Downscaling
Kim, Heewon and Choi, Myungsub and Lim, Bee and Mu Lee, Kyoung
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Heewon and Choi, Myungsub and Lim, Bee and Mu Lee, Kyoung},
title = {Task-Aware Image Downscaling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Product Quantization Network for Fast Image Retrieval
Yu, Tan and Yuan, Junsong and Fang, Chen and Jin, Hailin
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Tan and Yuan, Junsong and Fang, Chen and Jin, Hailin},
title = {Product Quantization Network for Fast Image Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Supervising the new with the old: learning SFM from SFM
Klodt, Maria and Vedaldi, Andrea
[pdf]
[bibtex]
@InProceedings{Klodt_2018_ECCV,
author = {Klodt, Maria and Vedaldi, Andrea},
title = {Supervising the new with the old: learning SFM from SFM},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline
Xu, Zhenbo and Yang, Wei and Meng, Ajin and Lu, Nanxue and Huang, Huan and Ying, Changchun and Huang, Liusheng
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Zhenbo and Yang, Wei and Meng, Ajin and Lu, Nanxue and Huang, Huan and Ying, Changchun and Huang, Liusheng},
title = {Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Ask, Acquire, and Attack: Data-free UAP Generation using Class Impressions
Mopuri, Konda Reddy and Uppala, Phani Krishna and Babu, R. Venkatesh
[pdf]
[bibtex]
@InProceedings{Mopuri_2018_ECCV,
author = {Mopuri, Konda Reddy and Uppala, Phani Krishna and Babu, R. Venkatesh},
title = {Ask, Acquire, and Attack: Data-free UAP Generation using Class Impressions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Separating Reflection and Transmission Images in the Wild
Wieschollek, Patrick and Gallo, Orazio and Gu, Jinwei and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Wieschollek_2018_ECCV,
author = {Wieschollek, Patrick and Gallo, Orazio and Gu, Jinwei and Kautz, Jan},
title = {Separating Reflection and Transmission Images in the Wild},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hard-Aware Point-to-Set Deep Metric for Person Re-identification
Yu, Rui and Dou, Zhiyong and Bai, Song and Zhang, Zhaoxiang and Xu, Yongchao and Bai, Xiang
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Rui and Dou, Zhiyong and Bai, Song and Zhang, Zhaoxiang and Xu, Yongchao and Bai, Xiang},
title = {Hard-Aware Point-to-Set Deep Metric for Person Re-identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Cross-Modal and Hierarchical Modeling of Video and Text
Zhang, Bowen and Hu, Hexiang and Sha, Fei
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Bowen and Hu, Hexiang and Sha, Fei},
title = {Cross-Modal and Hierarchical Modeling of Video and Text},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

StarMap for Category-Agnostic Keypoint and Viewpoint Estimation
Zhou, Xingyi and Karpur, Arjun and Luo, Linjie and Huang, Qixing
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Xingyi and Karpur, Arjun and Luo, Linjie and Huang, Qixing},
title = {StarMap for Category-Agnostic Keypoint and Viewpoint Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
Jakubovitz, Daniel and Giryes, Raja
[pdf]
[bibtex]
@InProceedings{Jakubovitz_2018_ECCV,
author = {Jakubovitz, Daniel and Giryes, Raja},
title = {Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RelocNet: Continuous Metric Learning Relocalisation using Neural Nets
Balntas, Vassileios and Li, Shuda and Prisacariu, Victor
[pdf]
[bibtex]
@InProceedings{Balntas_2018_ECCV,
author = {Balntas, Vassileios and Li, Shuda and Prisacariu, Victor},
title = {RelocNet: Continuous Metric Learning Relocalisation using Neural Nets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification
Wang, Cheng and Zhang, Qian and Huang, Chang and Liu, Wenyu and Wang, Xinggang
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Cheng and Zhang, Qian and Huang, Chang and Liu, Wenyu and Wang, Xinggang},
title = {Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recurrent Tubelet Proposal and Recognition Networks for Action Detection
Li, Dong and Qiu, Zhaofan and Dai, Qi and Yao, Ting and Mei, Tao
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Dong and Qiu, Zhaofan and Dai, Qi and Yao, Ting and Mei, Tao},
title = {Recurrent Tubelet Proposal and Recognition Networks for Action Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Estimating Depth from RGB and Sparse Sensing
Chen, Zhao and Badrinarayanan, Vijay and Drozdov, Gilad and Rabinovich, Andrew
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Zhao and Badrinarayanan, Vijay and Drozdov, Gilad and Rabinovich, Andrew},
title = {Estimating Depth from RGB and Sparse Sensing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Folded Recurrent Neural Networks for Future Video Prediction
Oliu, Marc and Selva, Javier and Escalera, Sergio
[pdf]
[bibtex]
@InProceedings{Oliu_2018_ECCV,
author = {Oliu, Marc and Selva, Javier and Escalera, Sergio},
title = {Folded Recurrent Neural Networks for Future Video Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
Huang, Siyuan and Qi, Siyuan and Zhu, Yixin and Xiao, Yinxue and Xu, Yuanlu and Zhu, Song-Chun
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Siyuan and Qi, Siyuan and Zhu, Yixin and Xiao, Yinxue and Xu, Yuanlu and Zhu, Song-Chun},
title = {Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation
Zhang, Zhenyu and Cui, Zhen and Xu, Chunyan and Jie, Zequn and Li, Xiang and Yang, Jian
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Zhenyu and Cui, Zhen and Xu, Chunyan and Jie, Zequn and Li, Xiang and Yang, Jian},
title = {Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding
Hadji, Isma and Wildes, Richard P.
[pdf]
[bibtex]
@InProceedings{Hadji_2018_ECCV,
author = {Hadji, Isma and Wildes, Richard P.},
title = {A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Compositing-aware Image Search
Zhao, Hengshuang and Shen, Xiaohui and Lin, Zhe and Sunkavalli, Kalyan and Price, Brian and Jia, Jiaya
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Hengshuang and Shen, Xiaohui and Lin, Zhe and Sunkavalli, Kalyan and Price, Brian and Jia, Jiaya},
title = {Compositing-aware Image Search},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Extreme Network Compression via Filter Group Approximation
Peng, Bo and Tan, Wenming and Li, Zheyang and Zhang, Shun and Xie, Di and Pu, Shiliang
[pdf]
[bibtex]
@InProceedings{Peng_2018_ECCV,
author = {Peng, Bo and Tan, Wenming and Li, Zheyang and Zhang, Shun and Xie, Di and Pu, Shiliang},
title = {Extreme Network Compression via Filter Group Approximation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Owens, Andrew and Efros, Alexei A.
[pdf]
[bibtex]
@InProceedings{Owens_2018_ECCV,
author = {Owens, Andrew and Efros, Alexei A.},
title = {Audio-Visual Scene Analysis with Self-Supervised Multisensory Features},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
Wang, Xin and Xiong, Wenhan and Wang, Hongmin and Wang, William Yang
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Xin and Xiong, Wenhan and Wang, Hongmin and Wang, William Yang},
title = {Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Structure-from-Motion-Aware PatchMatch for Adaptive Optical Flow Estimation
Maurer, Daniel and Marniok, Nico and Goldluecke, Bastian and Bruhn, Andres
[pdf]
[bibtex]
@InProceedings{Maurer_2018_ECCV,
author = {Maurer, Daniel and Marniok, Nico and Goldluecke, Bastian and Bruhn, Andres},
title = {Structure-from-Motion-Aware PatchMatch for Adaptive Optical Flow Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Ma, Ningning and Zhang, Xiangyu and Zheng, Hai-Tao and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Ma_2018_ECCV,
author = {Ma, Ningning and Zhang, Xiangyu and Zheng, Hai-Tao and Sun, Jian},
title = {ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attention-GAN for Object Transfiguration in Wild Images
Chen, Xinyuan and Xu, Chang and Yang, Xiaokang and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Xinyuan and Xu, Chang and Yang, Xiaokang and Tao, Dacheng},
title = {Attention-GAN for Object Transfiguration in Wild Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking
Yao, Yingjie and Wu, Xiaohe and Zhang, Lei and Shan, Shiguang and Zuo, Wangmeng
[pdf]
[bibtex]
@InProceedings{Yao_2018_ECCV,
author = {Yao, Yingjie and Wu, Xiaohe and Zhang, Lei and Shan, Shiguang and Zuo, Wangmeng},
title = {Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver
Persson, Mikael and Nordberg, Klas
[pdf]
[bibtex]
@InProceedings{Persson_2018_ECCV,
author = {Persson, Mikael and Nordberg, Klas},
title = {Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
Khamis, Sameh and Fanello, Sean and Rhemann, Christoph and Kowdle, Adarsh and Valentin, Julien and Izadi, Shahram
[pdf]
[bibtex]
@InProceedings{Khamis_2018_ECCV,
author = {Khamis, Sameh and Fanello, Sean and Rhemann, Christoph and Kowdle, Adarsh and Valentin, Julien and Izadi, Shahram},
title = {StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Robust Optical Flow in Rainy Scenes
Li, Ruoteng and Tan, Robby T. and Cheong, Loong-Fah
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Ruoteng and Tan, Robby T. and Cheong, Loong-Fah},
title = {Robust Optical Flow in Rainy Scenes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Scale Aggregation Network for Accurate and Efficient Crowd Counting
Cao, Xinkun and Wang, Zhipeng and Zhao, Yanyun and Su, Fei
[pdf]
[bibtex]
@InProceedings{Cao_2018_ECCV,
author = {Cao, Xinkun and Wang, Zhipeng and Zhao, Yanyun and Su, Fei},
title = {Scale Aggregation Network for Accurate and Efficient Crowd Counting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Feature Factorization For Concept Discovery
Collins, Edo and Achanta, Radhakrishna and Susstrunk, Sabine
[pdf]
[bibtex]
@InProceedings{Collins_2018_ECCV,
author = {Collins, Edo and Achanta, Radhakrishna and Susstrunk, Sabine},
title = {Deep Feature Factorization For Concept Discovery},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Object-centered image stitching
Herrmann, Charles and Wang, Chen and Bowen, Richard Strong and Keyder, Emil and Zabih, Ramin
[pdf]
[bibtex]
@InProceedings{Herrmann_2018_ECCV,
author = {Herrmann, Charles and Wang, Chen and Bowen, Richard Strong and Keyder, Emil and Zabih, Ramin},
title = {Object-centered image stitching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Style-Aware Content Loss for Real-time HD Style Transfer
Sanakoyeu, Artsiom and Kotovenko, Dmytro and Lang, Sabine and Ommer, Bjorn
[pdf]
[bibtex]
@InProceedings{Sanakoyeu_2018_ECCV,
author = {Sanakoyeu, Artsiom and Kotovenko, Dmytro and Lang, Sabine and Ommer, Bjorn},
title = {A Style-Aware Content Loss for Real-time HD Style Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining
Li, Xia and Wu, Jianlong and Lin, Zhouchen and Liu, Hong and Zha, Hongbin
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Xia and Wu, Jianlong and Lin, Zhouchen and Liu, Hong and Zha, Hongbin},
title = {Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Acquisition of Localization Confidence for Accurate Object Detection
Jiang, Borui and Luo, Ruixuan and Mao, Jiayuan and Xiao, Tete and Jiang, Yuning
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Borui and Luo, Ruixuan and Mao, Jiayuan and Xiao, Tete and Jiang, Yuning},
title = {Acquisition of Localization Confidence for Accurate Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network
Feng, Yao and Wu, Fan and Shao, Xiaohu and Wang, Yanfeng and Zhou, Xi
[pdf]
[bibtex]
@InProceedings{Feng_2018_ECCV,
author = {Feng, Yao and Wu, Fan and Shao, Xiaohu and Wang, Yanfeng and Zhou, Xi},
title = {Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
Fan, Deng-Ping and Cheng, Ming-Ming and Liu, Jiang-Jiang and Gao, Shang-Hua and Hou, Qibin and Borji, Ali
[pdf]
[bibtex]
@InProceedings{Fan_2018_ECCV,
author = {Fan, Deng-Ping and Cheng, Ming-Ming and Liu, Jiang-Jiang and Gao, Shang-Hua and Hou, Qibin and Borji, Ali},
title = {Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multimodal Unsupervised Image-to-image Translation
Huang, Xun and Liu, Ming-Yu and Belongie, Serge and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Xun and Liu, Ming-Yu and Belongie, Serge and Kautz, Jan},
title = {Multimodal Unsupervised Image-to-image Translation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Diverse feature visualizations reveal invariances in early layers of deep neural networks
Cadena, Santiago A. and Weis, Marissa A. and Gatys, Leon A. and Bethge, Matthias and Ecker, Alexander S.
[pdf]
[bibtex]
@InProceedings{Cadena_2018_ECCV,
author = {Cadena, Santiago A. and Weis, Marissa A. and Gatys, Leon A. and Bethge, Matthias and Ecker, Alexander S.},
title = {Diverse feature visualizations reveal invariances in early layers of deep neural networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention
Chen, Tianlang and Zhang, Zhongping and You, Quanzeng and Fang, Chen and Wang, Zhaowen and Jin, Hailin and Luo, Jiebo
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Tianlang and Zhang, Zhongping and You, Quanzeng and Fang, Chen and Wang, Zhaowen and Jin, Hailin and Luo, Jiebo},
title = {``Factual'' or ``Emotional'': Stylized Image Captioning with Adaptive Learning and Attention},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deblurring Natural Image Using Super-Gaussian Fields
Liu, Yuhang and Dong, Wenyong and Gong, Dong and Zhang, Lei and Shi, Qinfeng
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Yuhang and Dong, Wenyong and Gong, Dong and Zhang, Lei and Shi, Qinfeng},
title = {Deblurring Natural Image Using Super-Gaussian Fields},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dense Semantic and Topological Correspondence of 3D Faces without Landmarks
Fan, Zhenfeng and Hu, Xiyuan and Chen, Chen and Peng, Silong
[pdf]
[bibtex]
@InProceedings{Fan_2018_ECCV,
author = {Fan, Zhenfeng and Hu, Xiyuan and Chen, Chen and Peng, Silong},
title = {Dense Semantic and Topological Correspondence of 3D Faces without Landmarks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas
Zioulis, Nikolaos and Karakottas, Antonis and Zarpalas, Dimitrios and Daras, Petros
[pdf]
[bibtex]
@InProceedings{Zioulis_2018_ECCV,
author = {Zioulis, Nikolaos and Karakottas, Antonis and Zarpalas, Dimitrios and Daras, Petros},
title = {OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

On Regularized Losses for Weakly-supervised CNN Segmentation
Tang, Meng and Perazzi, Federico and Djelouah, Abdelaziz and Ben Ayed, Ismail and Schroers, Christopher and Boykov, Yuri
[pdf]
[bibtex]
@InProceedings{Tang_2018_ECCV,
author = {Tang, Meng and Perazzi, Federico and Djelouah, Abdelaziz and Ben Ayed, Ismail and Schroers, Christopher and Boykov, Yuri},
title = {On Regularized Losses for Weakly-supervised CNN Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Dynamic Memory Networks for Object Tracking
Yang, Tianyu and Chan, Antoni B.
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Tianyu and Chan, Antoni B.},
title = {Learning Dynamic Memory Networks for Object Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Zero-Shot Deep Domain Adaptation
Peng, Kuan-Chuan and Wu, Ziyan and Ernst, Jan
[pdf]
[bibtex]
@InProceedings{Peng_2018_ECCV,
author = {Peng, Kuan-Chuan and Wu, Ziyan and Ernst, Jan},
title = {Zero-Shot Deep Domain Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images
Coors, Benjamin and Condurache, Alexandru Paul and Geiger, Andreas
[pdf]
[bibtex]
@InProceedings{Coors_2018_ECCV,
author = {Coors, Benjamin and Condurache, Alexandru Paul and Geiger, Andreas},
title = {SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Graininess-Aware Deep Feature Learning for Pedestrian Detection
Lin, Chunze and Lu, Jiwen and Wang, Gang and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Lin_2018_ECCV,
author = {Lin, Chunze and Lu, Jiwen and Wang, Gang and Zhou, Jie},
title = {Graininess-Aware Deep Feature Learning for Pedestrian Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Forecast and Refine Residual Motion for Image-to-Video Generation
Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris},
title = {Learning to Forecast and Refine Residual Motion for Image-to-Video Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ML-LocNet: Improving Object Localization with Multi-view Learning Network
Zhang, Xiaopeng and Yang, Yang and Feng, Jiashi
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Xiaopeng and Yang, Yang and Feng, Jiashi},
title = {ML-LocNet: Improving Object Localization with Multi-view Learning Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Statistically-motivated Second-order Pooling
Yu, Kaicheng and Salzmann, Mathieu
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Kaicheng and Salzmann, Mathieu},
title = {Statistically-motivated Second-order Pooling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving Generalization via Scalable Neighborhood Component Analysis
Wu, Zhirong and Efros, Alexei A. and Yu, Stella X.
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Zhirong and Efros, Alexei A. and Yu, Stella X.},
title = {Improving Generalization via Scalable Neighborhood Component Analysis},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement
Gan, Yukang and Xu, Xiangyu and Sun, Wenxiu and Lin, Liang
[pdf]
[bibtex]
@InProceedings{Gan_2018_ECCV,
author = {Gan, Yukang and Xu, Xiangyu and Sun, Wenxiu and Lin, Liang},
title = {Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Anonymize Faces for Privacy Preserving Action Detection
Ren, Zhongzheng and Lee, Yong Jae and Ryoo, Michael S.
[pdf]
[bibtex]
@InProceedings{Ren_2018_ECCV,
author = {Ren, Zhongzheng and Lee, Yong Jae and Ryoo, Michael S.},
title = {Learning to Anonymize Faces for Privacy Preserving Action Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Distractor-aware Siamese Networks for Visual Object Tracking
Zhu, Zheng and Wang, Qiang and Li, Bo and Wu, Wei and Yan, Junjie and Hu, Weiming
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Zheng and Wang, Qiang and Li, Bo and Wu, Wei and Yan, Junjie and Hu, Weiming},
title = {Distractor-aware Siamese Networks for Visual Object Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Question Type Guided Attention in Visual Question Answering
Shi, Yang and Furlanello, Tommaso and Zha, Sheng and Anandkumar, Animashree
[pdf]
[bibtex]
@InProceedings{Shi_2018_ECCV,
author = {Shi, Yang and Furlanello, Tommaso and Zha, Sheng and Anandkumar, Animashree},
title = {Question Type Guided Attention in Visual Question Answering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Escaping from Collapsing Modes in a Constrained Space
Chang, Chia-Che and Lin, Chieh Hubert and Lee, Che-Rung and Juan, Da-Cheng and Wei, Wei and Chen, Hwann-Tzong
[pdf]
[bibtex]
@InProceedings{Chang_2018_ECCV,
author = {Chang, Chia-Che and Lin, Chieh Hubert and Lee, Che-Rung and Juan, Da-Cheng and Wei, Wei and Chen, Hwann-Tzong},
title = {Escaping from Collapsing Modes in a Constrained Space},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling
Santo, Hiroaki and Waechter, Michael and Samejima, Masaki and Sugano, Yusuke and Matsushita, Yasuyuki
[pdf]
[bibtex]
@InProceedings{Santo_2018_ECCV,
author = {Santo, Hiroaki and Waechter, Michael and Samejima, Masaki and Sugano, Yusuke and Matsushita, Yasuyuki},
title = {Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Bayesian Semantic Instance Segmentation in Open Set World
Pham, Trung and Kumar, Vijay B. G. and Do, Thanh-Toan and Carneiro, Gustavo and Reid, Ian
[pdf]
[bibtex]
@InProceedings{Pham_2018_ECCV,
author = {Pham, Trung and Kumar, Vijay B. G. and Do, Thanh-Toan and Carneiro, Gustavo and Reid, Ian},
title = {Bayesian Semantic Instance Segmentation in Open Set World},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning
Robert, Thomas and Thome, Nicolas and Cord, Matthieu
[pdf]
[bibtex]
@InProceedings{Robert_2018_ECCV,
author = {Robert, Thomas and Thome, Nicolas and Cord, Matthieu},
title = {HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Uncertainty Estimates and Multi-Hypotheses Networks for Optical Flow
Ilg, Eddy and Cicek, Ozgun and Galesso, Silvio and Klein, Aaron and Makansi, Osama and Hutter, Frank and Brox, Thomas
[pdf]
[bibtex]
@InProceedings{Ilg_2018_ECCV,
author = {Ilg, Eddy and Cicek, Ozgun and Galesso, Silvio and Klein, Aaron and Makansi, Osama and Hutter, Frank and Brox, Thomas},
title = {Uncertainty Estimates and Multi-Hypotheses Networks for Optical Flow},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation
Wang, Chao and Zheng, Haiyong and Yu, Zhibin and Zheng, Ziqiang and Gu, Zhaorui and Zheng, Bing
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Chao and Zheng, Haiyong and Yu, Zhibin and Zheng, Ziqiang and Gu, Zhaorui and Zheng, Bing},
title = {Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Transductive Semi-Supervised Deep Learning using Min-Max Features
Shi, Weiwei and Gong, Yihong and Ding, Chris and Ma, Zhiheng and Tao, Xiaoyu and Zheng, Nanning
[pdf]
[bibtex]
@InProceedings{Shi_2018_ECCV,
author = {Shi, Weiwei and Gong, Yihong and Ding, Chris and Ma, Zhiheng and Tao, Xiaoyu and Zheng, Nanning},
title = {Transductive Semi-Supervised Deep Learning using Min-Max Features},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Interpolating Convolutional Neural Networks Using Batch Normalization
Data, Gratianus Wesley Putra and Ngu, Kirjon and Murray, David William and Prisacariu, Victor Adrian
[pdf]
[bibtex]
@InProceedings{Data_2018_ECCV,
author = {Data, Gratianus Wesley Putra and Ngu, Kirjon and Murray, David William and Prisacariu, Victor Adrian},
title = {Interpolating Convolutional Neural Networks Using Batch Normalization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Blind Video Temporal Consistency
Lai, Wei-Sheng and Huang, Jia-Bin and Wang, Oliver and Shechtman, Eli and Yumer, Ersin and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Lai_2018_ECCV,
author = {Lai, Wei-Sheng and Huang, Jia-Bin and Wang, Oliver and Shechtman, Eli and Yumer, Ersin and Yang, Ming-Hsuan},
title = {Learning Blind Video Temporal Consistency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition
Jiang, Huajie and Wang, Ruiping and Shan, Shiguang and Chen, Xilin
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Huajie and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
title = {Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fine-grained Video Categorization with Redundancy Reduction Attention
Zhu, Chen and Tan, Xiao and Zhou, Feng and Liu, Xiao and Yue, Kaiyu and Ding, Errui and Ma, Yi
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Chen and Tan, Xiao and Zhou, Feng and Liu, Xiao and Yue, Kaiyu and Ding, Errui and Ma, Yi},
title = {Fine-grained Video Categorization with Redundancy Reduction Attention},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Object Detection in Video with Spatiotemporal Sampling Networks
Bertasius, Gedas and Torresani, Lorenzo and Shi, Jianbo
[pdf]
[bibtex]
@InProceedings{Bertasius_2018_ECCV,
author = {Bertasius, Gedas and Torresani, Lorenzo and Shi, Jianbo},
title = {Object Detection in Video with Spatiotemporal Sampling Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Graph Distillation for Action Detection with Privileged Modalities
Luo, Zelun and Hsieh, Jun-Ting and Jiang, Lu and Niebles, Juan Carlos and Fei-Fei, Li
[pdf]
[bibtex]
@InProceedings{Luo_2018_ECCV,
author = {Luo, Zelun and Hsieh, Jun-Ting and Jiang, Lu and Niebles, Juan Carlos and Fei-Fei, Li},
title = {Graph Distillation for Action Detection with Privileged Modalities},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Uncertainty Estimation for Semantic Segmentation in Videos
Huang, Po-Yu and Hsu, Wan-Ting and Chiu, Chun-Yueh and Wu, Ting-Fan and Sun, Min
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Po-Yu and Hsu, Wan-Ting and Chiu, Chun-Yueh and Wu, Ting-Fan and Sun, Min},
title = {Efficient Uncertainty Estimation for Semantic Segmentation in Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Saliency Preservation in Low-Resolution Grayscale Images
Yohanandan, Shivanthan and Song, Andy and Dyer, Adrian G. and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Yohanandan_2018_ECCV,
author = {Yohanandan, Shivanthan and Song, Andy and Dyer, Adrian G. and Tao, Dacheng},
title = {Saliency Preservation in Low-Resolution Grayscale Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Polarimetric Three-View Geometry
Chen, Lixiong and Zheng, Yinqiang and Subpa-asa, Art and Sato, Imari
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Lixiong and Zheng, Yinqiang and Subpa-asa, Art and Sato, Imari},
title = {Polarimetric Three-View Geometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Imbalanced Attribute Classification using Visual Attention Aggregation
Sarafianos, Nikolaos and Xu, Xiang and Kakadiaris, Ioannis A.
[pdf]
[bibtex]
@InProceedings{Sarafianos_2018_ECCV,
author = {Sarafianos, Nikolaos and Xu, Xiang and Kakadiaris, Ioannis A.},
title = {Deep Imbalanced Attribute Classification using Visual Attention Aggregation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Adding Attentiveness to the Neurons in Recurrent Neural Networks
Zhang, Pengfei and Xue, Jianru and Lan, Cuiling and Zeng, Wenjun and Gao, Zhanning and Zheng, Nanning
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Pengfei and Xue, Jianru and Lan, Cuiling and Zeng, Wenjun and Gao, Zhanning and Zheng, Nanning},
title = {Adding Attentiveness to the Neurons in Recurrent Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Seeing Deeply and Bidirectionally: A Deep Learning Approach for Single Image Reflection Removal
Yang, Jie and Gong, Dong and Liu, Lingqiao and Shi, Qinfeng
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Jie and Gong, Dong and Liu, Lingqiao and Shi, Qinfeng},
title = {Seeing Deeply and Bidirectionally: A Deep Learning Approach for Single Image Reflection Removal},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fast and Accurate Camera Covariance Computation for Large 3D Reconstruction
Polic, Michal and Forstner, Wolfgang and Pajdla, Tomas
[pdf]
[bibtex]
@InProceedings{Polic_2018_ECCV,
author = {Polic, Michal and Forstner, Wolfgang and Pajdla, Tomas},
title = {Fast and Accurate Camera Covariance Computation for Large 3D Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries
Margffoy-Tuay, Edgar and Perez, Juan C. and Botero, Emilio and Arbelaez, Pablo
[pdf]
[bibtex]
@InProceedings{Margffoy-Tuay_2018_ECCV,
author = {Margffoy-Tuay, Edgar and Perez, Juan C. and Botero, Emilio and Arbelaez, Pablo},
title = {Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning SO(3) Equivariant Representations with Spherical CNNs
Esteves, Carlos and Allen-Blanchette, Christine and Makadia, Ameesh and Daniilidis, Kostas
[pdf]
[bibtex]
@InProceedings{Esteves_2018_ECCV,
author = {Esteves, Carlos and Allen-Blanchette, Christine and Makadia, Ameesh and Daniilidis, Kostas},
title = {Learning SO(3) Equivariant Representations with Spherical CNNs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-out Classifiers
Vyas, Apoorv and Jammalamadaka, Nataraj and Zhu, Xia and Das, Dipankar and Kaul, Bharat and Willke, Theodore L.
[pdf]
[bibtex]
@InProceedings{Vyas_2018_ECCV,
author = {Vyas, Apoorv and Jammalamadaka, Nataraj and Zhu, Xia and Das, Dipankar and Kaul, Bharat and Willke, Theodore L.},
title = {Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-out Classifiers},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification
Du, Yang and Yuan, Chunfeng and Li, Bing and Zhao, Lili and Li, Yangxi and Hu, Weiming
[pdf]
[bibtex]
@InProceedings{Du_2018_ECCV,
author = {Du, Yang and Yuan, Chunfeng and Li, Bing and Zhao, Lili and Li, Yangxi and Hu, Weiming},
title = {Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks
Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei
[pdf]
[bibtex]
@InProceedings{Zheng_2018_ECCV,
author = {Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
title = {T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Object Detection with an Aligned Spatial-Temporal Memory
Xiao, Fanyi and Jae Lee, Yong
[pdf]
[bibtex]
@InProceedings{Xiao_2018_ECCV,
author = {Xiao, Fanyi and Jae Lee, Yong},
title = {Video Object Detection with an Aligned Spatial-Temporal Memory},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering
Li, Zhengqi and Snavely, Noah
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Zhengqi and Snavely, Noah},
title = {CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Partial Adversarial Domain Adaptation
Cao, Zhangjie and Ma, Lijia and Long, Mingsheng and Wang, Jianmin
[pdf]
[bibtex]
@InProceedings{Cao_2018_ECCV,
author = {Cao, Zhangjie and Ma, Lijia and Long, Mingsheng and Wang, Jianmin},
title = {Partial Adversarial Domain Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Diverse and Coherent Paragraph Generation from Images
Chatterjee, Moitreya and Schwing, Alexander G.
[pdf]
[bibtex]
@InProceedings{Chatterjee_2018_ECCV,
author = {Chatterjee, Moitreya and Schwing, Alexander G.},
title = {Diverse and Coherent Paragraph Generation from Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Diverse Image-to-Image Translation via Disentangled Representations
Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh and Yang, Ming-Hsuan},
title = {Diverse Image-to-Image Translation via Disentangled Representations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

BOP: Benchmark for 6D Object Pose Estimation
Hodan, Tomas and Michel, Frank and Brachmann, Eric and Kehl, Wadim and Glent Buch, Anders and Kraft, Dirk and Drost, Bertram and Vidal, Joel and Ihrke, Stephan and Zabulis, Xenophon and Sahin, Caner and Manhardt, Fabian and Tombari, Federico and Kim, Tae-Kyun and Matas, Jiri and Rother, Carsten
[pdf]
[bibtex]
@InProceedings{Hodan_2018_ECCV,
author = {Hodan, Tomas and Michel, Frank and Brachmann, Eric and Kehl, Wadim and Glent Buch, Anders and Kraft, Dirk and Drost, Bertram and Vidal, Joel and Ihrke, Stephan and Zabulis, Xenophon and Sahin, Caner and Manhardt, Fabian and Tombari, Federico and Kim, Tae-Kyun and Matas, Jiri and Rother, Carsten},
title = {BOP: Benchmark for 6D Object Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generative Domain-Migration Hashing for Sketch-to-Image Retrieval
Zhang, Jingyi and Shen, Fumin and Liu, Li and Zhu, Fan and Yu, Mengyang and Shao, Ling and Tao Shen, Heng and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Jingyi and Shen, Fumin and Liu, Li and Zhu, Fan and Yu, Mengyang and Shao, Ling and Tao Shen, Heng and Van Gool, Luc},
title = {Generative Domain-Migration Hashing for Sketch-to-Image Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multimodal image alignment through a multiscale chain of neural networks with application to remote sensing
Zampieri, Armand and Charpiat, Guillaume and Girard, Nicolas and Tarabalka, Yuliya
[pdf]
[bibtex]
@InProceedings{Zampieri_2018_ECCV,
author = {Zampieri, Armand and Charpiat, Guillaume and Girard, Nicolas and Tarabalka, Yuliya},
title = {Multimodal image alignment through a multiscale chain of neural networks with application to remote sensing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans
Liu, Chen and Wu, Jiaye and Furukawa, Yasutaka
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Chen and Wu, Jiaye and Furukawa, Yasutaka},
title = {FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Hard Example Mining from Videos for Improved Object Detection
Jin, SouYoung and RoyChowdhury, Aruni and Jiang, Huaizu and Singh, Ashish and Prasad, Aditya and Chakraborty, Deep and Learned-Miller, Erik
[pdf]
[bibtex]
@InProceedings{Jin_2018_ECCV,
author = {Jin, SouYoung and RoyChowdhury, Aruni and Jiang, Huaizu and Singh, Ashish and Prasad, Aditya and Chakraborty, Deep and Learned-Miller, Erik},
title = {Unsupervised Hard Example Mining from Videos for Improved Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Deeply-initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment
Valle, Roberto and Buenaposada, Jose M. and Valdes, Antonio and Baumela, Luis
[pdf]
[bibtex]
@InProceedings{Valle_2018_ECCV,
author = {Valle, Roberto and Buenaposada, Jose M. and Valdes, Antonio and Baumela, Luis},
title = {A Deeply-initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Transferring GANs: generating images from limited data
Wang, Yaxing and Wu, Chenshen and Herranz, Luis and van de Weijer, Joost and Gonzalez-Garcia, Abel and Raducanu, Bogdan
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Yaxing and Wu, Chenshen and Herranz, Luis and van de Weijer, Joost and Gonzalez-Garcia, Abel and Raducanu, Bogdan},
title = {Transferring GANs: generating images from limited data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking
Li, Chenglong and Zhu, Chengli and Huang, Yan and Tang, Jin and Wang, Liang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Chenglong and Zhu, Chengli and Huang, Yan and Tang, Jin and Wang, Liang},
title = {Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Broadcasting Convolutional Network for Visual Relational Reasoning
Chang, Simyung and Yang, John and Park, SeongUk and Kwak, Nojun
[pdf]
[bibtex]
@InProceedings{Chang_2018_ECCV,
author = {Chang, Simyung and Yang, John and Park, SeongUk and Kwak, Nojun},
title = {Broadcasting Convolutional Network for Visual Relational Reasoning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
Zou, Yuliang and Luo, Zelun and Huang, Jia-Bin
[pdf]
[bibtex]
@InProceedings{Zou_2018_ECCV,
author = {Zou, Yuliang and Luo, Zelun and Huang, Jia-Bin},
title = {DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

K-convexity shape priors for segmentation
Isack, Hossam and Gorelick, Lena and Ng, Karin and Veksler, Olga and Boykov, Yuri
[pdf]
[bibtex]
@InProceedings{Isack_2018_ECCV,
author = {Isack, Hossam and Gorelick, Lena and Ng, Karin and Veksler, Olga and Boykov, Yuri},
title = {K-convexity shape priors for segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data
Zhang, Yabin and Tang, Hui and Jia, Kui
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Yabin and Tang, Hui and Jia, Kui},
title = {Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition
Yu, Chaojian and Zhao, Xinyi and Zheng, Qi and Zhang, Peng and You, Xinge
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Chaojian and Zhao, Xinyi and Zheng, Qi and Zhang, Peng and You, Xinge},
title = {Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unpaired Image Captioning by Language Pivoting
Gu, Jiuxiang and Joty, Shafiq and Cai, Jianfei and Wang, Gang
[pdf]
[bibtex]
@InProceedings{Gu_2018_ECCV,
author = {Gu, Jiuxiang and Joty, Shafiq and Cai, Jianfei and Wang, Gang},
title = {Unpaired Image Captioning by Language Pivoting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Face De-Spoofing: Anti-Spoofing via Noise Modeling
Jourabloo, Amin and Liu, Yaojie and Liu, Xiaoming
[pdf]
[bibtex]
@InProceedings{Jourabloo_2018_ECCV,
author = {Jourabloo, Amin and Liu, Yaojie and Liu, Xiaoming},
title = {Face De-Spoofing: Anti-Spoofing via Noise Modeling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation
Rhodin, Helge and Salzmann, Mathieu and Fua, Pascal
[pdf]
[bibtex]
@InProceedings{Rhodin_2018_ECCV,
author = {Rhodin, Helge and Salzmann, Mathieu and Fua, Pascal},
title = {Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Comparator Networks
Xie, Weidi and Shen, Li and Zisserman, Andrew
[pdf]
[bibtex]
@InProceedings{Xie_2018_ECCV,
author = {Xie, Weidi and Shen, Li and Zisserman, Andrew},
title = {Comparator Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Quaternion Convolutional Neural Networks
Zhu, Xuanyu and Xu, Yi and Xu, Hongteng and Chen, Changjian
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Xuanyu and Xu, Yi and Xu, Hongteng and Chen, Changjian},
title = {Quaternion Convolutional Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Priors for Semantic 3D Reconstruction
Cherabier, Ian and Schonberger, Johannes L. and Oswald, Martin R. and Pollefeys, Marc and Geiger, Andreas
[pdf]
[bibtex]
@InProceedings{Cherabier_2018_ECCV,
author = {Cherabier, Ian and Schonberger, Johannes L. and Oswald, Martin R. and Pollefeys, Marc and Geiger, Andreas},
title = {Learning Priors for Semantic 3D Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Map and Symmetry Synchronization
Sun, Yifan and Liang, Zhenxiao and Huang, Xiangru and Huang, Qixing
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Yifan and Liang, Zhenxiao and Huang, Xiangru and Huang, Qixing},
title = {Joint Map and Symmetry Synchronization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Start, Follow, Read: End-to-End Full-Page Handwriting Recognition
Wigington, Curtis and Tensmeyer, Chris and Davis, Brian and Barrett, William and Price, Brian and Cohen, Scott
[pdf]
[bibtex]
@InProceedings{Wigington_2018_ECCV,
author = {Wigington, Curtis and Tensmeyer, Chris and Davis, Brian and Barrett, William and Price, Brian and Cohen, Scott},
title = {Start, Follow, Read: End-to-End Full-Page Handwriting Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Reverse Attention for Salient Object Detection
Chen, Shuhan and Tan, Xiuli and Wang, Ben and Hu, Xuelong
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Shuhan and Tan, Xiuli and Wang, Ben and Hu, Xuelong},
title = {Reverse Attention for Salient Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
Long, Shangbang and Ruan, Jiaqiang and Zhang, Wenjie and He, Xin and Wu, Wenhao and Yao, Cong
[pdf]
[bibtex]
@InProceedings{Long_2018_ECCV,
author = {Long, Shangbang and Ruan, Jiaqiang and Zhang, Wenjie and He, Xin and Wu, Wenhao and Yao, Cong},
title = {TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Linear Span Network for Object Skeleton Detection
Liu, Chang and Ke, Wei and Qin, Fei and Ye, Qixiang
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Chang and Ke, Wei and Qin, Fei and Ye, Qixiang},
title = {Linear Span Network for Object Skeleton Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Relative Attribute Learning using Graph Neural Networks
Meng, Zihang and Adluru, Nagesh and Kim, Hyunwoo J. and Fung, Glenn and Singh, Vikas
[pdf]
[bibtex]
@InProceedings{Meng_2018_ECCV,
author = {Meng, Zihang and Adluru, Nagesh and Kim, Hyunwoo J. and Fung, Glenn and Singh, Vikas},
title = {Efficient Relative Attribute Learning using Graph Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Model-free Consensus Maximization for Non-Rigid Shapes
Probst, Thomas and Chhatkuli, Ajad and Pani Paudel, Danda and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Probst_2018_ECCV,
author = {Probst, Thomas and Chhatkuli, Ajad and Pani Paudel, Danda and Van Gool, Luc},
title = {Model-free Consensus Maximization for Non-Rigid Shapes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

U-PC: Unsupervised Planogram Compliance
Ray, Archan and Kumar, Nishant and Shaw, Avishek and Prasad Mukherjee, Dipti
[pdf]
[bibtex]
@InProceedings{Ray_2018_ECCV,
author = {Ray, Archan and Kumar, Nishant and Shaw, Avishek and Prasad Mukherjee, Dipti},
title = {U-PC: Unsupervised Planogram Compliance},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Predicting Future Instance Segmentation by Forecasting Convolutional Features
Luc, Pauline and Couprie, Camille and LeCun, Yann and Verbeek, Jakob
[pdf]
[bibtex]
@InProceedings{Luc_2018_ECCV,
author = {Luc, Pauline and Couprie, Camille and LeCun, Yann and Verbeek, Jakob},
title = {Predicting Future Instance Segmentation by Forecasting Convolutional Features},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Person Search by Multi-Scale Matching
Lan, Xu and Zhu, Xiatian and Gong, Shaogang
[pdf]
[bibtex]
@InProceedings{Lan_2018_ECCV,
author = {Lan, Xu and Zhu, Xiatian and Gong, Shaogang},
title = {Person Search by Multi-Scale Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Flow-Grounded Spatial-Temporal Video Prediction from Still Images
Li, Yijun and Fang, Chen and Yang, Jimei and Wang, Zhaowen and Lu, Xin and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yijun and Fang, Chen and Yang, Jimei and Wang, Zhaowen and Lu, Xin and Yang, Ming-Hsuan},
title = {Flow-Grounded Spatial-Temporal Video Prediction from Still Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Liquid Pouring Monitoring via Rich Sensory Inputs
Wu, Tz-Ying and Lin, Juan-Ting and Wang, Tsun-Hsuang and Hu, Chan-Wei and Niebles, Juan Carlos and Sun, Min
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Tz-Ying and Lin, Juan-Ting and Wang, Tsun-Hsuang and Hu, Chan-Wei and Niebles, Juan Carlos and Sun, Min},
title = {Liquid Pouring Monitoring via Rich Sensory Inputs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Exploiting temporal information for 3D human pose estimation
Rayat Imtiaz Hossain, Mir and Little, James J.
[pdf]
[bibtex]
@InProceedings{Hossain_2018_ECCV,
author = {Rayat Imtiaz Hossain, Mir and Little, James J.},
title = {Exploiting temporal information for 3D human pose estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised CNN-based Co-Saliency Detection with Graphical Optimization
Hsu, Kuang-Jui and Tsai, Chung-Chi and Lin, Yen-Yu and Qian, Xiaoning and Chuang, Yung-Yu
[pdf]
[bibtex]
@InProceedings{Hsu_2018_ECCV,
author = {Hsu, Kuang-Jui and Tsai, Chung-Chi and Lin, Yen-Yu and Qian, Xiaoning and Chuang, Yung-Yu},
title = {Unsupervised CNN-based Co-Saliency Detection with Graphical Optimization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Localization Recall Precision (LRP): A New Performance Metric for Object Detection
Oksuz, Kemal and Can Cam, Baris and Akbas, Emre and Kalkan, Sinan
[pdf]
[bibtex]
@InProceedings{Oksuz_2018_ECCV,
author = {Oksuz, Kemal and Can Cam, Baris and Akbas, Emre and Kalkan, Sinan},
title = {Localization Recall Precision (LRP): A New Performance Metric for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attentive Semantic Alignment with Offset-Aware Correlation Kernels
Hongsuck Seo, Paul and Lee, Jongmin and Jung, Deunsol and Han, Bohyung and Cho, Minsu
[pdf]
[bibtex]
@InProceedings{Seo_2018_ECCV,
author = {Hongsuck Seo, Paul and Lee, Jongmin and Jung, Deunsol and Han, Bohyung and Cho, Minsu},
title = {Attentive Semantic Alignment with Offset-Aware Correlation Kernels},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning 3D Human Pose from Structure and Motion
Dabral, Rishabh and Mundhada, Anurag and Kusupati, Uday and Afaque, Safeer and Sharma, Abhishek and Jain, Arjun
[pdf]
[bibtex]
@InProceedings{Dabral_2018_ECCV,
author = {Dabral, Rishabh and Mundhada, Anurag and Kusupati, Uday and Afaque, Safeer and Sharma, Abhishek and Jain, Arjun},
title = {Learning 3D Human Pose from Structure and Motion},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks
Qiu, Qiang and Lezama, Jose and Bronstein, Alex and Sapiro, Guillermo
[pdf]
[bibtex]
@InProceedings{Qiu_2018_ECCV,
author = {Qiu, Qiang and Lezama, Jose and Bronstein, Alex and Sapiro, Guillermo},
title = {ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Online Detection of Action Start in Untrimmed, Streaming Videos
Shou, Zheng and Pan, Junting and Chan, Jonathan and Miyazawa, Kazuyuki and Mansour, Hassan and Vetro, Anthony and Giro-i-Nieto, Xavier and Chang, Shih-Fu
[pdf]
[bibtex]
@InProceedings{Shou_2018_ECCV,
author = {Shou, Zheng and Pan, Junting and Chan, Jonathan and Miyazawa, Kazuyuki and Mansour, Hassan and Vetro, Anthony and Giro-i-Nieto, Xavier and Chang, Shih-Fu},
title = {Online Detection of Action Start in Untrimmed, Streaming Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Exploring the Limits of Weakly Supervised Pretraining
Mahajan, Dhruv and Girshick, Ross and Ramanathan, Vignesh and He, Kaiming and Paluri, Manohar and Li, Yixuan and Bharambe, Ashwin and van der Maaten, Laurens
[pdf]
[bibtex]
@InProceedings{Mahajan_2018_ECCV,
author = {Mahajan, Dhruv and Girshick, Ross and Ramanathan, Vignesh and He, Kaiming and Paluri, Manohar and Li, Yixuan and Bharambe, Ashwin and van der Maaten, Laurens},
title = {Exploring the Limits of Weakly Supervised Pretraining},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
Cheng, Bowen and Wei, Yunchao and Shi, Honghui and Feris, Rogerio and Xiong, Jinjun and Huang, Thomas
[pdf]
[bibtex]
@InProceedings{Cheng_2018_ECCV,
author = {Cheng, Bowen and Wei, Yunchao and Shi, Honghui and Feris, Rogerio and Xiong, Jinjun and Huang, Thomas},
title = {Revisiting RCNN: On Awakening the Classification Power of Faster RCNN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HandMap: Robust Hand Pose Estimation via Intermediate Dense Guidance Map Supervision
Wu, Xiaokun and Finnegan, Daniel and O'Neill, Eamonn and Yang, Yong-Liang
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Xiaokun and Finnegan, Daniel and O'Neill, Eamonn and Yang, Yong-Liang},
title = {HandMap: Robust Hand Pose Estimation via Intermediate Dense Guidance Map Supervision},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Learning of Multi-Frame Optical Flow with Occlusions
Janai, Joel and Guney, Fatma and Ranjan, Anurag and Black, Michael and Geiger, Andreas
[pdf]
[bibtex]
@InProceedings{Janai_2018_ECCV,
author = {Janai, Joel and Guney, Fatma and Ranjan, Anurag and Black, Michael and Geiger, Andreas},
title = {Unsupervised Learning of Multi-Frame Optical Flow with Occlusions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Integrating Egocentric Videos in Top-view Surveillance Videos: Joint Identification and Temporal Alignment
Ardeshir, Shervin and Borji, Ali
[pdf]
[bibtex]
@InProceedings{Ardeshir_2018_ECCV,
author = {Ardeshir, Shervin and Borji, Ali},
title = {Integrating Egocentric Videos in Top-view Surveillance Videos: Joint Identification and Temporal Alignment},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attribute-Guided Face Generation Using Conditional CycleGAN
Lu, Yongyi and Tai, Yu-Wing and Tang, Chi-Keung
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Yongyi and Tai, Yu-Wing and Tang, Chi-Keung},
title = {Attribute-Guided Face Generation Using Conditional CycleGAN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images
Tateno, Keisuke and Navab, Nassir and Tombari, Federico
[pdf]
[bibtex]
@InProceedings{Tateno_2018_ECCV,
author = {Tateno, Keisuke and Navab, Nassir and Tombari, Federico},
title = {Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Camera Spectral Sensitivity Selection and Hyperspectral Image Recovery
Fu, Ying and Zhang, Tao and Zheng, Yinqiang and Zhang, Debing and Huang, Hua
[pdf]
[bibtex]
@InProceedings{Fu_2018_ECCV,
author = {Fu, Ying and Zhang, Tao and Zheng, Yinqiang and Zhang, Debing and Huang, Hua},
title = {Joint Camera Spectral Sensitivity Selection and Hyperspectral Image Recovery},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Monocular Depth Estimation Using Whole Strip Masking and Reliability-Based Refinement
Heo, Minhyeok and Lee, Jaehan and Kim, Kyung-Rae and Kim, Han-Ul and Kim, Chang-Su
[pdf]
[bibtex]
@InProceedings{Heo_2018_ECCV,
author = {Heo, Minhyeok and Lee, Jaehan and Kim, Kyung-Rae and Kim, Han-Ul and Kim, Chang-Su},
title = {Monocular Depth Estimation Using Whole Strip Masking and Reliability-Based Refinement},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Analyzing Clothing Layer Deformation Statistics of 3D Human Motions
Yang, Jinlong and Franco, Jean-Sebastien and Hetroy-Wheeler, Franck and Wuhrer, Stefanie
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Jinlong and Franco, Jean-Sebastien and Hetroy-Wheeler, Franck and Wuhrer, Stefanie},
title = {Analyzing Clothing Layer Deformation Statistics of 3D Human Motions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Image Super-Resolution Using Very Deep Residual Channel Attention Networks
Zhang, Yulun and Li, Kunpeng and Li, Kai and Wang, Lichen and Zhong, Bineng and Fu, Yun
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Yulun and Li, Kunpeng and Li, Kai and Wang, Lichen and Zhong, Bineng and Fu, Yun},
title = {Image Super-Resolution Using Very Deep Residual Channel Attention Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-Supervised Generative Adversarial Hashing for Image Retrieval
Wang, Guan'an and Hu, Qinghao and Cheng, Jian and Hou, Zengguang
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Guan'an and Hu, Qinghao and Cheng, Jian and Hou, Zengguang},
title = {Semi-Supervised Generative Adversarial Hashing for Image Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Single-View 3D Reconstruction with Limited Pose Supervision
Yang, Guandao and Cui, Yin and Belongie, Serge and Hariharan, Bharath
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Guandao and Cui, Yin and Belongie, Serge and Hariharan, Bharath},
title = {Learning Single-View 3D Reconstruction with Limited Pose Supervision},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image
Li, Zhengqin and Sunkavalli, Kalyan and Chandraker, Manmohan
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Zhengqin and Sunkavalli, Kalyan and Chandraker, Manmohan},
title = {Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Scale Spatially-Asymmetric Recalibration for Image Classification
Wang, Yan and Xie, Lingxi and Qiao, Siyuan and Zhang, Ya and Zhang, Wenjun and Yuille, Alan L.
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Yan and Xie, Lingxi and Qiao, Siyuan and Zhang, Ya and Zhang, Wenjun and Yuille, Alan L.},
title = {Multi-Scale Spatially-Asymmetric Recalibration for Image Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Graph Adaptive Knowledge Transfer for Unsupervised Domain Adaptation
Ding, Zhengming and Li, Sheng and Shao, Ming and Fu, Yun
[pdf]
[bibtex]
@InProceedings{Ding_2018_ECCV,
author = {Ding, Zhengming and Li, Sheng and Shao, Ming and Fu, Yun},
title = {Graph Adaptive Knowledge Transfer for Unsupervised Domain Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving Sequential Determinantal Point Processes for Supervised Video Summarization
Sharghi, Aidean and Borji, Ali and Li, Chengtao and Yang, Tianbao and Gong, Boqing
[pdf]
[bibtex]
@InProceedings{Sharghi_2018_ECCV,
author = {Sharghi, Aidean and Borji, Ali and Li, Chengtao and Yang, Tianbao and Gong, Boqing},
title = {Improving Sequential Determinantal Point Processes for Supervised Video Summarization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Specular-to-Diffuse Translation for Multi-View Reconstruction
Wu, Shihao and Huang, Hui and Portenier, Tiziano and Sela, Matan and Cohen-Or, Daniel and Kimmel, Ron and Zwicker, Matthias
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Shihao and Huang, Hui and Portenier, Tiziano and Sela, Matan and Cohen-Or, Daniel and Kimmel, Ron and Zwicker, Matthias},
title = {Specular-to-Diffuse Translation for Multi-View Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RESOUND: Towards Action Recognition without Representation Bias
Li, Yingwei and Li, Yi and Vasconcelos, Nuno
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yingwei and Li, Yi and Vasconcelos, Nuno},
title = {RESOUND: Towards Action Recognition without Representation Bias},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Framework for Evaluating 6-DOF Object Trackers
Garon, Mathieu and Laurendeau, Denis and Lalonde, Jean-Francois
[pdf]
[bibtex]
@InProceedings{Garon_2018_ECCV,
author = {Garon, Mathieu and Laurendeau, Denis and Lalonde, Jean-Francois},
title = {A Framework for Evaluating 6-DOF Object Trackers},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Extending Layered Models to 3D Motion
Lao, Dong and Sundaramoorthi, Ganesh
[pdf]
[bibtex]
@InProceedings{Lao_2018_ECCV,
author = {Lao, Dong and Sundaramoorthi, Ganesh},
title = {Extending Layered Models to 3D Motion},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Long-term Tracking in the Wild: a Benchmark
Valmadre, Jack and Bertinetto, Luca and Henriques, Joao F. and Tao, Ran and Vedaldi, Andrea and Smeulders, Arnold W.M. and Torr, Philip H.S. and Gavves, Efstratios
[pdf]
[bibtex]
@InProceedings{Valmadre_2018_ECCV,
author = {Valmadre, Jack and Bertinetto, Luca and Henriques, Joao F. and Tao, Ran and Vedaldi, Andrea and Smeulders, Arnold W.M. and Torr, Philip H.S. and Gavves, Efstratios},
title = {Long-term Tracking in the Wild: a Benchmark},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Human Motion Analysis with Deep Metric Learning
Coskun, Huseyin and Joseph Tan, David and Conjeti, Sailesh and Navab, Nassir and Tombari, Federico
[pdf]
[bibtex]
@InProceedings{Coskun_2018_ECCV,
author = {Coskun, Huseyin and Joseph Tan, David and Conjeti, Sailesh and Navab, Nassir and Tombari, Federico},
title = {Human Motion Analysis with Deep Metric Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Adaptive Affinity Fields for Semantic Segmentation
Ke, Tsung-Wei and Hwang, Jyh-Jing and Liu, Ziwei and Yu, Stella X.
[pdf]
[bibtex]
@InProceedings{Ke_2018_ECCV,
author = {Ke, Tsung-Wei and Hwang, Jyh-Jing and Liu, Ziwei and Yu, Stella X.},
title = {Adaptive Affinity Fields for Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hierarchy of Alternating Specialists for Scene Recognition
Jin Kim, Hyo and Frahm, Jan-Michael
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Jin Kim, Hyo and Frahm, Jan-Michael},
title = {Hierarchy of Alternating Specialists for Scene Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Scale Structure-Aware Network for Human Pose Estimation
Ke, Lipeng and Chang, Ming-Ching and Qi, Honggang and Lyu, Siwei
[pdf]
[bibtex]
@InProceedings{Ke_2018_ECCV,
author = {Ke, Lipeng and Chang, Ming-Ching and Qi, Honggang and Lyu, Siwei},
title = {Multi-Scale Structure-Aware Network for Human Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

License Plate Detection and Recognition in Unconstrained Scenarios
Montazzolli Silva, Sergio and Rosito Jung, Claudio
[pdf]
[bibtex]
@InProceedings{Silva_2018_ECCV,
author = {Montazzolli Silva, Sergio and Rosito Jung, Claudio},
title = {License Plate Detection and Recognition in Unconstrained Scenarios},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders
Felsen, Panna and Lucey, Patrick and Ganguly, Sujoy
[pdf]
[bibtex]
@InProceedings{Felsen_2018_ECCV,
author = {Felsen, Panna and Lucey, Patrick and Ganguly, Sujoy},
title = {Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Multi-Task Learning to Recognise Subtle Facial Expressions of Mental States
Hu, Guosheng and Liu, Li and Yuan, Yang and Yu, Zehao and Hua, Yang and Zhang, Zhihong and Shen, Fumin and Shao, Ling and Hospedales, Timothy and Robertson, Neil and Yang, Yongxin
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Guosheng and Liu, Li and Yuan, Yang and Yu, Zehao and Hua, Yang and Zhang, Zhihong and Shen, Fumin and Shao, Ling and Hospedales, Timothy and Robertson, Neil and Yang, Yongxin},
title = {Deep Multi-Task Learning to Recognise Subtle Facial Expressions of Mental States},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction
Shi, Yifei and Xu, Kai and Niessner, Matthias and Rusinkiewicz, Szymon and Funkhouser, Thomas
[pdf]
[bibtex]
@InProceedings{Shi_2018_ECCV,
author = {Shi, Yifei and Xu, Kai and Niessner, Matthias and Rusinkiewicz, Szymon and Funkhouser, Thomas},
title = {PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors
Deng, Haowen and Birdal, Tolga and Ilic, Slobodan
[pdf]
[bibtex]
@InProceedings{Deng_2018_ECCV,
author = {Deng, Haowen and Birdal, Tolga and Ilic, Slobodan},
title = {PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HBE: Hand Branch Ensemble Network for Real-time 3D Hand Pose Estimation
Zhou, Yidan and Lu, Jian and Du, Kuo and Lin, Xiangbo and Sun, Yi and Ma, Xiaohong
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Yidan and Lu, Jian and Du, Kuo and Lin, Xiangbo and Sun, Yi and Ma, Xiaohong},
title = {HBE: Hand Branch Ensemble Network for Real-time 3D Hand Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking
Groth, Oliver and Fuchs, Fabian B. and Posner, Ingmar and Vedaldi, Andrea
[pdf]
[bibtex]
@InProceedings{Groth_2018_ECCV,
author = {Groth, Oliver and Fuchs, Fabian B. and Posner, Ingmar and Vedaldi, Andrea},
title = {ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification
Wei, Xing and Zhang, Yue and Gong, Yihong and Zhang, Jiawei and Zheng, Nanning
[pdf]
[bibtex]
@InProceedings{Wei_2018_ECCV,
author = {Wei, Xing and Zhang, Yue and Gong, Yihong and Zhang, Jiawei and Zheng, Nanning},
title = {Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Generative Models for Weakly-Supervised Multi-Label Classification
Chu, Hong-Min and Yeh, Chih-Kuan and Frank Wang, Yu-Chiang
[pdf]
[bibtex]
@InProceedings{Chu_2018_ECCV,
author = {Chu, Hong-Min and Yeh, Chih-Kuan and Frank Wang, Yu-Chiang},
title = {Deep Generative Models for Weakly-Supervised Multi-Label Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SRDA: Generating Instance Segmentation Annotation via Scanning, Reasoning and Domain Adaptation
Xu, Wenqiang and Li, Yonglu and Lu, Cewu
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Wenqiang and Li, Yonglu and Lu, Cewu},
title = {SRDA: Generating Instance Segmentation Annotation via Scanning, Reasoning and Domain Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models
Tourani, Siddharth and Shekhovtsov, Alexander and Rother, Carsten and Savchynskyy, Bogdan
[pdf]
[bibtex]
@InProceedings{Tourani_2018_ECCV,
author = {Tourani, Siddharth and Shekhovtsov, Alexander and Rother, Carsten and Savchynskyy, Bogdan},
title = {MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation
Hu, Yuan-Ting and Huang, Jia-Bin and Schwing, Alexander G.
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Yuan-Ting and Huang, Jia-Bin and Schwing, Alexander G.},
title = {Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-Supervised Deep Learning with Memory
Chen, Yanbei and Zhu, Xiatian and Gong, Shaogang
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Yanbei and Zhu, Xiatian and Gong, Shaogang},
title = {Semi-Supervised Deep Learning with Memory},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Reinforcement Learning with Iterative Shift for Visual Tracking
Ren, Liangliang and Yuan, Xin and Lu, Jiwen and Yang, Ming and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Ren_2018_ECCV,
author = {Ren, Liangliang and Yuan, Xin and Lu, Jiwen and Yang, Ming and Zhou, Jie},
title = {Deep Reinforcement Learning with Iterative Shift for Visual Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

X2Face: A network for controlling face generation using images, audio, and pose codes
Wiles, Olivia and Sophia Koepke, A. and Zisserman, Andrew
[pdf]
[bibtex]
@InProceedings{Wiles_2018_ECCV,
author = {Wiles, Olivia and Sophia Koepke, A. and Zisserman, Andrew},
title = {X2Face: A network for controlling face generation using images, audio, and pose codes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Correcting the Triplet Selection Bias for Triplet Loss
Yu, Baosheng and Liu, Tongliang and Gong, Mingming and Ding, Changxing and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Baosheng and Liu, Tongliang and Gong, Mingming and Ding, Changxing and Tao, Dacheng},
title = {Correcting the Triplet Selection Bias for Triplet Loss},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Women also Snowboard: Overcoming Bias in Captioning Models
Anne Hendricks, Lisa and Burns, Kaylee and Saenko, Kate and Darrell, Trevor and Rohrbach, Anna
[pdf]
[bibtex]
@InProceedings{Hendricks_2018_ECCV,
author = {Anne Hendricks, Lisa and Burns, Kaylee and Saenko, Kate and Darrell, Trevor and Rohrbach, Anna},
title = {Women also Snowboard: Overcoming Bias in Captioning Models},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction
Jiang, Li and Shi, Shaoshuai and Qi, Xiaojuan and Jia, Jiaya
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Li and Shi, Shaoshuai and Qi, Xiaojuan and Jia, Jiaya},
title = {GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contextual-based Image Inpainting: Infer, Match, and Translate
Song, Yuhang and Yang, Chao and Lin, Zhe and Liu, Xiaofeng and Huang, Qin and Li, Hao and Jay Kuo, C.-C.
[pdf]
[bibtex]
@InProceedings{Song_2018_ECCV,
author = {Song, Yuhang and Yang, Chao and Lin, Zhe and Liu, Xiaofeng and Huang, Qin and Li, Hao and Jay Kuo, C.-C.},
title = {Contextual-based Image Inpainting: Infer, Match, and Translate},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Inner Space Preserving Generative Pose Machine
Liu, Shuangjun and Ostadabbas, Sarah
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Shuangjun and Ostadabbas, Sarah},
title = {Inner Space Preserving Generative Pose Machine},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network
Bai, Yancheng and Zhang, Yongqiang and Ding, Mingli and Ghanem, Bernard
[pdf]
[bibtex]
@InProceedings{Bai_2018_ECCV,
author = {Bai, Yancheng and Zhang, Yongqiang and Ding, Mingli and Ghanem, Bernard},
title = {SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets
Liu, Xiaofeng and Vijaya Kumar, B.V.K. and Yang, Chao and Tang, Qingming and You, Jane
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Xiaofeng and Vijaya Kumar, B.V.K. and Yang, Chao and Tang, Qingming and You, Jane},
title = {Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

End-to-end View Synthesis for Light Field Imaging with Pseudo 4DCNN
Wang, Yunlong and Liu, Fei and Wang, Zilei and Hou, Guangqi and Sun, Zhenan and Tan, Tieniu
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Yunlong and Liu, Fei and Wang, Zilei and Hou, Guangqi and Sun, Zhenan and Tan, Tieniu},
title = {End-to-end View Synthesis for Light Field Imaging with Pseudo 4DCNN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Iterative Crowd Counting
Ranjan, Viresh and Le, Hieu and Hoai, Minh
[pdf]
[bibtex]
@InProceedings{Ranjan_2018_ECCV,
author = {Ranjan, Viresh and Le, Hieu and Hoai, Minh},
title = {Iterative Crowd Counting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks
Chen, Weixuan and McDuff, Daniel
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Weixuan and McDuff, Daniel},
title = {DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

On the Solvability of Viewing Graphs
Trager, Matthew and Osserman, Brian and Ponce, Jean
[pdf]
[bibtex]
@InProceedings{Trager_2018_ECCV,
author = {Trager, Matthew and Osserman, Brian and Ponce, Jean},
title = {On the Solvability of Viewing Graphs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
Zhang, Tianyun and Ye, Shaokai and Zhang, Kaiqi and Tang, Jian and Wen, Wujie and Fardad, Makan and Wang, Yanzhi
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Tianyun and Ye, Shaokai and Zhang, Kaiqi and Tang, Jian and Wen, Wujie and Fardad, Makan and Wang, Yanzhi},
title = {A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multimodal Dual Attention Memory for Video Story Question Answering
Kim, Kyung-Min and Choi, Seong-Ho and Kim, Jin-Hwa and Zhang, Byoung-Tak
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Kyung-Min and Choi, Seong-Ho and Kim, Jin-Hwa and Zhang, Byoung-Tak},
title = {Multimodal Dual Attention Memory for Video Story Question Answering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
Kim, Yonghyun and Kang, Bong-Nam and Kim, Daijin
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Yonghyun and Kang, Bong-Nam and Kim, Daijin},
title = {SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Single Shot Scene Text Retrieval
Gomez, Lluis and Mafla, Andres and Rusinol, Marcal and Karatzas, Dimosthenis
[pdf]
[bibtex]
@InProceedings{Gomez_2018_ECCV,
author = {Gomez, Lluis and Mafla, Andres and Rusinol, Marcal and Karatzas, Dimosthenis},
title = {Single Shot Scene Text Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dynamic Task Prioritization for Multitask Learning
Guo, Michelle and Haque, Albert and Huang, De-An and Yeung, Serena and Fei-Fei, Li
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Michelle and Haque, Albert and Huang, De-An and Yeung, Serena and Fei-Fei, Li},
title = {Dynamic Task Prioritization for Multitask Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Self-supervised Knowledge Distillation Using Singular Value Decomposition
Hyun Lee, Seung and Ha Kim, Dae and Cheol Song, Byung
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Hyun Lee, Seung and Ha Kim, Dae and Cheol Song, Byung},
title = {Self-supervised Knowledge Distillation Using Singular Value Decomposition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Transductive Centroid Projection for Semi-supervised Large-scale Recognition
Liu, Yu and Song, Guanglu and Shao, Jing and Jin, Xiao and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Yu and Song, Guanglu and Shao, Jing and Jin, Xiao and Wang, Xiaogang},
title = {Transductive Centroid Projection for Semi-supervised Large-scale Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Shape Matching
Radenovic, Filip and Tolias, Giorgos and Chum, Ondrej
[pdf]
[bibtex]
@InProceedings{Radenovic_2018_ECCV,
author = {Radenovic, Filip and Tolias, Giorgos and Chum, Ondrej},
title = {Deep Shape Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network
Ahn, Namhyuk and Kang, Byungkon and Sohn, Kyung-Ah
[pdf]
[bibtex]
@InProceedings{Ahn_2018_ECCV,
author = {Ahn, Namhyuk and Kang, Byungkon and Sohn, Kyung-Ah},
title = {Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving
Liang, Xiaodan and Wang, Tairui and Yang, Luona and Xing, Eric
[pdf]
[bibtex]
@InProceedings{Liang_2018_ECCV,
author = {Liang, Xiaodan and Wang, Tairui and Yang, Luona and Xing, Eric},
title = {CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

EC-Net: an Edge-aware Point set Consolidation Network
Yu, Lequan and Li, Xianzhi and Fu, Chi-Wing and Cohen-Or, Daniel and Heng, Pheng-Ann
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Lequan and Li, Xianzhi and Fu, Chi-Wing and Cohen-Or, Daniel and Heng, Pheng-Ann},
title = {EC-Net: an Edge-aware Point set Consolidation Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Part-Activated Deep Reinforcement Learning for Action Prediction
Chen, Lei and Lu, Jiwen and Song, Zhanjie and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Lei and Lu, Jiwen and Song, Zhanjie and Zhou, Jie},
title = {Part-Activated Deep Reinforcement Learning for Action Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Navigate for Fine-grained Classification
Yang, Ze and Luo, Tiange and Wang, Dong and Hu, Zhiqiang and Gao, Jun and Wang, Liwei
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Ze and Luo, Tiange and Wang, Dong and Hu, Zhiqiang and Gao, Jun and Wang, Liwei},
title = {Learning to Navigate for Fine-grained Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Single Image Highlight Removal with a Sparse and Low-Rank Reflection Model
Guo, Jie and Zhou, Zuojian and Wang, Limin
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Jie and Zhou, Zuojian and Wang, Limin},
title = {Single Image Highlight Removal with a Sparse and Low-Rank Reflection Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving Shape Deformation in Unsupervised Image-to-Image Translation
Gokaslan, Aaron and Ramanujan, Vivek and Ritchie, Daniel and In Kim, Kwang and Tompkin, James
[pdf]
[bibtex]
@InProceedings{Gokaslan_2018_ECCV,
author = {Gokaslan, Aaron and Ramanujan, Vivek and Ritchie, Daniel and In Kim, Kwang and Tompkin, James},
title = {Improving Shape Deformation in Unsupervised Image-to-Image Translation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Scalable Exemplar-based Subspace Clustering on Class-Imbalanced Data
You, Chong and Li, Chi and Robinson, Daniel P. and Vidal, Rene
[pdf]
[bibtex]
@InProceedings{You_2018_ECCV,
author = {You, Chong and Li, Chi and Robinson, Daniel P. and Vidal, Rene},
title = {Scalable Exemplar-based Subspace Clustering on Class-Imbalanced Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D Ego-Pose Estimation via Imitation Learning
Yuan, Ye and Kitani, Kris
[pdf]
[bibtex]
@InProceedings{Yuan_2018_ECCV,
author = {Yuan, Ye and Kitani, Kris},
title = {3D Ego-Pose Estimation via Imitation Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Coreference Resolution in Visual Dialog using Neural Module Networks
Kottur, Satwik and Moura, Jose M. F. and Parikh, Devi and Batra, Dhruv and Rohrbach, Marcus
[pdf]
[bibtex]
@InProceedings{Kottur_2018_ECCV,
author = {Kottur, Satwik and Moura, Jose M. F. and Parikh, Devi and Batra, Dhruv and Rohrbach, Marcus},
title = {Visual Coreference Resolution in Visual Dialog using Neural Module Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

LSQ++: Lower running time and higher recall in multi-codebook quantization
Martinez, Julieta and Zakhmi, Shobhit and Hoos, Holger H. and Little, James J.
[pdf]
[bibtex]
@InProceedings{Martinez_2018_ECCV,
author = {Martinez, Julieta and Zakhmi, Shobhit and Hoos, Holger H. and Little, James J.},
title = {LSQ++: Lower running time and higher recall in multi-codebook quantization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Hybrid Model for Identity Obfuscation by Face Replacement
Sun, Qianru and Tewari, Ayush and Xu, Weipeng and Fritz, Mario and Theobalt, Christian and Schiele, Bernt
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Qianru and Tewari, Ayush and Xu, Weipeng and Fritz, Mario and Theobalt, Christian and Schiele, Bernt},
title = {A Hybrid Model for Identity Obfuscation by Face Replacement},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Depth-aware CNN for RGB-D Segmentation
Wang, Weiyue and Neumann, Ulrich
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Weiyue and Neumann, Ulrich},
title = {Depth-aware CNN for RGB-D Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
Yu, Changqian and Wang, Jingbo and Peng, Chao and Gao, Changxin and Yu, Gang and Sang, Nong
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Changqian and Wang, Jingbo and Peng, Chao and Gao, Changxin and Yu, Gang and Sang, Nong},
title = {BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems
Zhang, Yinda and Khamis, Sameh and Rhemann, Christoph and Valentin, Julien and Kowdle, Adarsh and Tankovich, Vladimir and Schoenberg, Michael and Izadi, Shahram and Funkhouser, Thomas and Fanello, Sean
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Yinda and Khamis, Sameh and Rhemann, Christoph and Valentin, Julien and Kowdle, Adarsh and Tankovich, Vladimir and Schoenberg, Michael and Izadi, Shahram and Funkhouser, Thomas and Fanello, Sean},
title = {ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Weakly- and Semi-Supervised Panoptic Segmentation
Li, Qizhu and Arnab, Anurag and Torr, Philip H.S.
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Qizhu and Arnab, Anurag and Torr, Philip H.S.},
title = {Weakly- and Semi-Supervised Panoptic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Selfie Video Stabilization
Yu, Jiyang and Ramamoorthi, Ravi
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Jiyang and Ramamoorthi, Ravi},
title = {Selfie Video Stabilization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Double JPEG Detection in Mixed JPEG Quality Factors using Deep Convolutional Neural Network
Park, Jinseok and Cho, Donghyeon and Ahn, Wonhyuk and Lee, Heung-Kyu
[pdf]
[bibtex]
@InProceedings{Park_2018_ECCV,
author = {Park, Jinseok and Cho, Donghyeon and Ahn, Wonhyuk and Lee, Heung-Kyu},
title = {Double JPEG Detection in Mixed JPEG Quality Factors using Deep Convolutional Neural Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Incremental Multi-graph Matching via Diversity and Randomness based Graph Clustering
Yu, Tianshu and Yan, Junchi and Liu, Wei and Li, Baoxin
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Tianshu and Yan, Junchi and Liu, Wei and Li, Baoxin},
title = {Incremental Multi-graph Matching via Diversity and Randomness based Graph Clustering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepTAM: Deep Tracking and Mapping
Zhou, Huizhong and Ummenhofer, Benjamin and Brox, Thomas
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Huizhong and Ummenhofer, Benjamin and Brox, Thomas},
title = {DeepTAM: Deep Tracking and Mapping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting
Rhinehart, Nicholas and Kitani, Kris M. and Vernaza, Paul
[pdf]
[bibtex]
@InProceedings{Rhinehart_2018_ECCV,
author = {Rhinehart, Nicholas and Kitani, Kris M. and Vernaza, Paul},
title = {R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters
Xu, Yifan and Fan, Tianqi and Xu, Mingye and Zeng, Long and Qiao, Yu
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Yifan and Fan, Tianqi and Xu, Mingye and Zeng, Long and Qiao, Yu},
title = {SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images
Guo, Sheng and Huang, Weilin and Zhang, Haozhi and Zhuang, Chenfan and Dong, Dengke and Scott, Matthew R. and Huang, Dinglong
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Sheng and Huang, Weilin and Zhang, Haozhi and Zhuang, Chenfan and Dong, Dengke and Scott, Matthew R. and Huang, Dinglong},
title = {CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation
Ilg, Eddy and Saikia, Tonmoy and Keuper, Margret and Brox, Thomas
[pdf]
[bibtex]
@InProceedings{Ilg_2018_ECCV,
author = {Ilg, Eddy and Saikia, Tonmoy and Keuper, Margret and Brox, Thomas},
title = {Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Quantization Mimic: Towards Very Tiny CNN for Object Detection
Wei, Yi and Pan, Xinyu and Qin, Hongwei and Ouyang, Wanli and Yan, Junjie
[pdf]
[bibtex]
@InProceedings{Wei_2018_ECCV,
author = {Wei, Yi and Pan, Xinyu and Qin, Hongwei and Ouyang, Wanli and Yan, Junjie},
title = {Quantization Mimic: Towards Very Tiny CNN for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation
Lv, Zhaoyang and Kim, Kihwan and Troccoli, Alejandro and Sun, Deqing and Rehg, James M. and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Lv_2018_ECCV,
author = {Lv, Zhaoyang and Kim, Kihwan and Troccoli, Alejandro and Sun, Deqing and Rehg, James M. and Kautz, Jan},
title = {Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Proximal Dehaze-Net: A Prior Learning-Based Deep Network for Single Image Dehazing
Yang, Dong and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Dong and Sun, Jian},
title = {Proximal Dehaze-Net: A Prior Learning-Based Deep Network for Single Image Dehazing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Textual Explanations for Self-Driving Vehicles
Kim, Jinkyu and Rohrbach, Anna and Darrell, Trevor and Canny, John and Akata, Zeynep
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Jinkyu and Rohrbach, Anna and Darrell, Trevor and Canny, John and Akata, Zeynep},
title = {Textual Explanations for Self-Driving Vehicles},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Focus, Segment and Erase: An Efficient Network for Multi-Label Brain Tumor Segmentation
Chen, Xuan and Hao Liew, Jun and Xiong, Wei and Chui, Chee-Kong and Ong, Sim-Heng
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Xuan and Hao Liew, Jun and Xiong, Wei and Chui, Chee-Kong and Ong, Sim-Heng},
title = {Focus, Segment and Erase: An Efficient Network for Multi-Label Brain Tumor Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Local Orthogonal-Group Testing
Iscen, Ahmet and Chum, Ondrej
[pdf]
[bibtex]
@InProceedings{Iscen_2018_ECCV,
author = {Iscen, Ahmet and Chum, Ondrej},
title = {Local Orthogonal-Group Testing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency
Chong, Eunji and Ruiz, Nataniel and Wang, Yongxin and Zhang, Yun and Rozga, Agata and Rehg, James M.
[pdf]
[bibtex]
@InProceedings{Chong_2018_ECCV,
author = {Chong, Eunji and Ruiz, Nataniel and Wang, Yongxin and Zhang, Yun and Rozga, Agata and Rehg, James M.},
title = {Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data
Liu, Xihui and Li, Hongsheng and Shao, Jing and Chen, Dapeng and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Xihui and Li, Hongsheng and Shao, Jing and Chen, Dapeng and Wang, Xiaogang},
title = {Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

VideoMatch: Matching based Video Object Segmentation
Hu, Yuan-Ting and Huang, Jia-Bin and Schwing, Alexander G.
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Yuan-Ting and Huang, Jia-Bin and Schwing, Alexander G.},
title = {VideoMatch: Matching based Video Object Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Video Object Segmentation with Motion-based Bilateral Networks
Li, Siyang and Seybold, Bryan and Vorobyov, Alexey and Lei, Xuejing and Jay Kuo, C.-C.
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Siyang and Seybold, Bryan and Vorobyov, Alexey and Lei, Xuejing and Jay Kuo, C.-C.},
title = {Unsupervised Video Object Segmentation with Motion-based Bilateral Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D Vehicle Trajectory Reconstruction in Monocular Video Data Using Environment Structure Constraints
Bullinger, Sebastian and Bodensteiner, Christoph and Arens, Michael and Stiefelhagen, Rainer
[pdf]
[bibtex]
@InProceedings{Bullinger_2018_ECCV,
author = {Bullinger, Sebastian and Bodensteiner, Christoph and Arens, Michael and Stiefelhagen, Rainer},
title = {3D Vehicle Trajectory Reconstruction in Monocular Video Data Using Environment Structure Constraints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
Yang, Xu and Zhang, Hanwang and Cai, Jianfei
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Xu and Zhang, Hanwang and Cai, Jianfei},
title = {Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors
Baranchuk, Dmitry and Babenko, Artem and Malkov, Yury
[pdf]
[bibtex]
@InProceedings{Baranchuk_2018_ECCV,
author = {Baranchuk, Dmitry and Babenko, Artem and Malkov, Yury},
title = {Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery
Payen de La Garanderie, Gregoire and Atapour Abarghouei, Amir and Breckon, Toby P.
[pdf]
[bibtex]
@InProceedings{Garanderie_2018_ECCV,
author = {Payen de La Garanderie, Gregoire and Atapour Abarghouei, Amir and Breckon, Toby P.},
title = {Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Towards Realistic Predictors
Wang, Pei and Vasconcelos, Nuno
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Pei and Vasconcelos, Nuno},
title = {Towards Realistic Predictors},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Deep Representations with Probabilistic Knowledge Transfer
Passalis, Nikolaos and Tefas, Anastasios
[pdf]
[bibtex]
@InProceedings{Passalis_2018_ECCV,
author = {Passalis, Nikolaos and Tefas, Anastasios},
title = {Learning Deep Representations with Probabilistic Knowledge Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DFT-based Transformation Invariant Pooling Layer for Visual Classification
Ryu, Jongbin and Yang, Ming-Hsuan and Lim, Jongwoo
[pdf]
[bibtex]
@InProceedings{Ryu_2018_ECCV,
author = {Ryu, Jongbin and Yang, Ming-Hsuan and Lim, Jongwoo},
title = {DFT-based Transformation Invariant Pooling Layer for Visual Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Objects that Sound
Arandjelovic, Relja and Zisserman, Andrew
[pdf]
[bibtex]
@InProceedings{Arandjelovic_2018_ECCV,
author = {Arandjelovic, Relja and Zisserman, Andrew},
title = {Objects that Sound},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

End-to-End Incremental Learning
Castro, Francisco M. and Marin-Jimenez, Manuel J. and Guil, Nicolas and Schmid, Cordelia and Alahari, Karteek
[pdf]
[bibtex]
@InProceedings{Castro_2018_ECCV,
author = {Castro, Francisco M. and Marin-Jimenez, Manuel J. and Guil, Nicolas and Schmid, Cordelia and Alahari, Karteek},
title = {End-to-End Incremental Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SaaS: Speed as a Supervisor for Semi-supervised Learning
Cicek, Safa and Fawzi, Alhussein and Soatto, Stefano
[pdf]
[bibtex]
@InProceedings{Cicek_2018_ECCV,
author = {Cicek, Safa and Fawzi, Alhussein and Soatto, Stefano},
title = {SaaS: Speed as a Supervisor for Semi-supervised Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network
Kim, Woojae and Kim, Jongyoo and Ahn, Sewoong and Kim, Jinwoo and Lee, Sanghoon
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Woojae and Kim, Jongyoo and Ahn, Sewoong and Kim, Jinwoo and Lee, Sanghoon},
title = {Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Narasimhan, Medhini and Schwing, Alexander G.
[pdf]
[bibtex]
@InProceedings{Narasimhan_2018_ECCV,
author = {Narasimhan, Medhini and Schwing, Alexander G.},
title = {Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Volumetric Video From Very Sparse Multi-View Performance Capture
Huang, Zeng and Li, Tianye and Chen, Weikai and Zhao, Yajie and Xing, Jun and LeGendre, Chloe and Luo, Linjie and Ma, Chongyang and Li, Hao
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Zeng and Li, Tianye and Chen, Weikai and Zhao, Yajie and Xing, Jun and LeGendre, Chloe and Luo, Linjie and Ma, Chongyang and Li, Hao},
title = {Deep Volumetric Video From Very Sparse Multi-View Performance Capture},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Neural Procedural Reconstruction for Residential Buildings
Zeng, Huayi and Wu, Jiaye and Furukawa, Yasutaka
[pdf]
[bibtex]
@InProceedings{Zeng_2018_ECCV,
author = {Zeng, Huayi and Wu, Jiaye and Furukawa, Yasutaka},
title = {Neural Procedural Reconstruction for Residential Buildings},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition
Weng, Junwu and Liu, Mengyuan and Jiang, Xudong and Yuan, Junsong
[pdf]
[bibtex]
@InProceedings{Weng_2018_ECCV,
author = {Weng, Junwu and Liu, Mengyuan and Jiang, Xudong and Yuan, Junsong},
title = {Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Trilateral Weighted Sparse Coding Scheme for Real-World Image Denoising
Xu, Jun and Zhang, Lei and Zhang, David
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Jun and Zhang, Lei and Zhang, David},
title = {A Trilateral Weighted Sparse Coding Scheme for Real-World Image Denoising},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition
Wang, Yitong and Gong, Dihong and Zhou, Zheng and Ji, Xing and Wang, Hao and Li, Zhifeng and Liu, Wei and Zhang, Tong
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Yitong and Gong, Dihong and Zhou, Zheng and Ji, Xing and Wang, Hao and Li, Zhifeng and Liu, Wei and Zhang, Tong},
title = {Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection
Song, Hongmei and Wang, Wenguan and Zhao, Sanyuan and Shen, Jianbing and Lam, Kin-Man
[pdf]
[bibtex]
@InProceedings{Song_2018_ECCV,
author = {Song, Hongmei and Wang, Wenguan and Zhao, Sanyuan and Shen, Jianbing and Lam, Kin-Man},
title = {Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Burst Denoising
Godard, Clement and Matzen, Kevin and Uyttendaele, Matt
[pdf]
[bibtex]
@InProceedings{Godard_2018_ECCV,
author = {Godard, Clement and Matzen, Kevin and Uyttendaele, Matt},
title = {Deep Burst Denoising},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Separate Object Sounds by Watching Unlabeled Video
Gao, Ruohan and Feris, Rogerio and Grauman, Kristen
[pdf]
[bibtex]
@InProceedings{Gao_2018_ECCV,
author = {Gao, Ruohan and Feris, Rogerio and Grauman, Kristen},
title = {Learning to Separate Object Sounds by Watching Unlabeled Video},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learnable PINs: Cross-Modal Embeddings for Person Identity
Nagrani, Arsha and Albanie, Samuel and Zisserman, Andrew
[pdf]
[bibtex]
@InProceedings{Nagrani_2018_ECCV,
author = {Nagrani, Arsha and Albanie, Samuel and Zisserman, Andrew},
title = {Learnable PINs: Cross-Modal Embeddings for Person Identity},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-object Tracking with Neural Gating Using Bilinear LSTM
Kim, Chanho and Li, Fuxin and Rehg, James M.
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Chanho and Li, Fuxin and Rehg, James M.},
title = {Multi-object Tracking with Neural Gating Using Bilinear LSTM},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera
von Marcard, Timo and Henschel, Roberto and Black, Michael J. and Rosenhahn, Bodo and Pons-Moll, Gerard
[pdf]
[bibtex]
@InProceedings{Marcard_2018_ECCV,
author = {von Marcard, Timo and Henschel, Roberto and Black, Michael J. and Rosenhahn, Bodo and Pons-Moll, Gerard},
title = {Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
Sakaridis, Christos and Dai, Dengxin and Hecker, Simon and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Sakaridis_2018_ECCV,
author = {Sakaridis, Christos and Dai, Dengxin and Hecker, Simon and Van Gool, Luc},
title = {Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications
Yang, Tien-Ju and Howard, Andrew and Chen, Bo and Zhang, Xiao and Go, Alec and Sandler, Mark and Sze, Vivienne and Adam, Hartwig
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Tien-Ju and Howard, Andrew and Chen, Bo and Zhang, Xiao and Go, Alec and Sandler, Mark and Sze, Vivienne and Adam, Hartwig},
title = {NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics
Yan, Xinchen and Rastogi, Akash and Villegas, Ruben and Sunkavalli, Kalyan and Shechtman, Eli and Hadap, Sunil and Yumer, Ersin and Lee, Honglak
[pdf]
[bibtex]
@InProceedings{Yan_2018_ECCV,
author = {Yan, Xinchen and Rastogi, Akash and Villegas, Ruben and Sunkavalli, Kalyan and Shechtman, Eli and Hadap, Sunil and Yumer, Ersin and Lee, Honglak},
title = {MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Affine Correspondences between Central Cameras for Rapid Relative Pose Estimation
Eichhardt, Ivan and Chetverikov, Dmitry
[pdf]
[bibtex]
@InProceedings{Eichhardt_2018_ECCV,
author = {Eichhardt, Ivan and Chetverikov, Dmitry},
title = {Affine Correspondences between Central Cameras for Rapid Relative Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Lifting Layers: Analysis and Applications
Ochs, Peter and Meinhardt, Tim and Leal-Taixe, Laura and Moeller, Michael
[pdf]
[bibtex]
@InProceedings{Ochs_2018_ECCV,
author = {Ochs, Peter and Meinhardt, Tim and Leal-Taixe, Laura and Moeller, Michael},
title = {Lifting Layers: Analysis and Applications},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Remote Photoplethysmography Correspondence Feature for 3D Mask Face Presentation Attack Detection
Liu, Si-Qi and Lan, Xiangyuan and Yuen, Pong C.
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Si-Qi and Lan, Xiangyuan and Yuen, Pong C.},
title = {Remote Photoplethysmography Correspondence Feature for 3D Mask Face Presentation Attack Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Beyond Part Models: Person Retrieval with Refined Part Pooling (and A Strong Convolutional Baseline)
Sun, Yifan and Zheng, Liang and Yang, Yi and Tian, Qi and Wang, Shengjin
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Yifan and Zheng, Liang and Yang, Yi and Tian, Qi and Wang, Shengjin},
title = {Beyond Part Models: Person Retrieval with Refined Part Pooling (and A Strong Convolutional Baseline)},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generative Adversarial Network with Spatial Attention for Face Attribute Editing
Zhang, Gang and Kan, Meina and Shan, Shiguang and Chen, Xilin
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Gang and Kan, Meina and Shan, Shiguang and Chen, Xilin},
title = {Generative Adversarial Network with Spatial Attention for Face Attribute Editing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pairwise Body-Part Attention for Recognizing Human-Object Interactions
Fang, Hao-Shu and Cao, Jinkun and Tai, Yu-Wing and Lu, Cewu
[pdf]
[bibtex]
@InProceedings{Fang_2018_ECCV,
author = {Fang, Hao-Shu and Cao, Jinkun and Tai, Yu-Wing and Lu, Cewu},
title = {Pairwise Body-Part Attention for Recognizing Human-Object Interactions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Person Search via A Mask-guided Two-stream CNN Model
Chen, Di and Zhang, Shanshan and Ouyang, Wanli and Yang, Jian and Tai, Ying
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Di and Zhang, Shanshan and Ouyang, Wanli and Yang, Jian and Tai, Ying},
title = {Person Search via A Mask-guided Two-stream CNN Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics
Kummerer, Matthias and Wallis, Thomas S. A. and Bethge, Matthias
[pdf]
[bibtex]
@InProceedings{Kummerer_2018_ECCV,
author = {Kummerer, Matthias and Wallis, Thomas S. A. and Bethge, Matthias},
title = {Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Sub-GAN: An Unsupervised Generative Model via Subspaces
Liang, Jie and Yang, Jufeng and Lee, Hsin-Ying and Wang, Kai and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Liang_2018_ECCV,
author = {Liang, Jie and Yang, Jufeng and Lee, Hsin-Ying and Wang, Kai and Yang, Ming-Hsuan},
title = {Sub-GAN: An Unsupervised Generative Model via Subspaces},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning
Buchler, Uta and Brattoli, Biagio and Ommer, Bjorn
[pdf]
[bibtex]
@InProceedings{Buchler_2018_ECCV,
author = {Buchler, Uta and Brattoli, Biagio and Ommer, Bjorn},
title = {Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking
Du, Dawei and Qi, Yuankai and Yu, Hongyang and Yang, Yifan and Duan, Kaiwen and Li, Guorong and Zhang, Weigang and Huang, Qingming and Tian, Qi
[pdf]
[bibtex]
@InProceedings{Du_2018_ECCV,
author = {Du, Dawei and Qi, Yuankai and Yu, Hongyang and Yang, Yifan and Duan, Kaiwen and Li, Guorong and Zhang, Weigang and Huang, Qingming and Tian, Qi},
title = {The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Beyond local reasoning for stereo confidence estimation with deep learning
Tosi, Fabio and Poggi, Matteo and Benincasa, Antonio and Mattoccia, Stefano
[pdf]
[bibtex]
@InProceedings{Tosi_2018_ECCV,
author = {Tosi, Fabio and Poggi, Matteo and Benincasa, Antonio and Mattoccia, Stefano},
title = {Beyond local reasoning for stereo confidence estimation with deep learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases
Stock, Pierre and Cisse, Moustapha
[pdf]
[bibtex]
@InProceedings{Stock_2018_ECCV,
author = {Stock, Pierre and Cisse, Moustapha},
title = {ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation
Oberweger, Markus and Rad, Mahdi and Lepetit, Vincent
[pdf]
[bibtex]
@InProceedings{Oberweger_2018_ECCV,
author = {Oberweger, Markus and Rad, Mahdi and Lepetit, Vincent},
title = {Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CBAM: Convolutional Block Attention Module
Woo, Sanghyun and Park, Jongchan and Lee, Joon-Young and So Kweon, In
[pdf]
[bibtex]
@InProceedings{Woo_2018_ECCV,
author = {Woo, Sanghyun and Park, Jongchan and Lee, Joon-Young and So Kweon, In},
title = {CBAM: Convolutional Block Attention Module},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Spatio-temporal Transformer Network for Video Restoration
Hyun Kim, Tae and Sajjadi, Mehdi S. M. and Hirsch, Michael and Scholkopf, Bernhard
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Hyun Kim, Tae and Sajjadi, Mehdi S. M. and Hirsch, Michael and Scholkopf, Bernhard},
title = {Spatio-temporal Transformer Network for Video Restoration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

stagNet: An Attentive Semantic RNN for Group Activity Recognition
Qi, Mengshi and Qin, Jie and Li, Annan and Wang, Yunhong and Luo, Jiebo and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Qi_2018_ECCV,
author = {Qi, Mengshi and Qin, Jie and Li, Annan and Wang, Yunhong and Luo, Jiebo and Van Gool, Luc},
title = {stagNet: An Attentive Semantic RNN for Group Activity Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Discriminative Video Representations Using Adversarial Perturbations
Wang, Jue and Cherian, Anoop
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Jue and Cherian, Anoop},
title = {Learning Discriminative Video Representations Using Adversarial Perturbations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

On Offline Evaluation of Vision-based Driving Models
Codevilla, Felipe and Lopez, Antonio M. and Koltun, Vladlen and Dosovitskiy, Alexey
[pdf]
[bibtex]
@InProceedings{Codevilla_2018_ECCV,
author = {Codevilla, Felipe and Lopez, Antonio M. and Koltun, Vladlen and Dosovitskiy, Alexey},
title = {On Offline Evaluation of Vision-based Driving Models},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Real-to-Virtual Domain Unification for End-to-End Autonomous Driving
Yang, Luona and Liang, Xiaodan and Wang, Tairui and Xing, Eric
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Luona and Liang, Xiaodan and Wang, Tairui and Xing, Eric},
title = {Real-to-Virtual Domain Unification for End-to-End Autonomous Driving},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization
Li, Yandong and Wang, Liqiang and Yang, Tianbao and Gong, Boqing
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yandong and Wang, Liqiang and Yang, Tianbao and Gong, Boqing},
title = {How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights
Mallya, Arun and Davis, Dillon and Lazebnik, Svetlana
[pdf]
[bibtex]
@InProceedings{Mallya_2018_ECCV,
author = {Mallya, Arun and Davis, Dillon and Lazebnik, Svetlana},
title = {Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PSANet: Point-wise Spatial Attention Network for Scene Parsing
Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Change Loy, Chen and Lin, Dahua and Jia, Jiaya
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Change Loy, Chen and Lin, Dahua and Jia, Jiaya},
title = {PSANet: Point-wise Spatial Attention Network for Scene Parsing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

X-ray Computed Tomography Through Scatter
Geva, Adam and Schechner, Yoav Y. and Chernyak, Yonatan and Gupta, Rajiv
[pdf]
[bibtex]
@InProceedings{Geva_2018_ECCV,
author = {Geva, Adam and Schechner, Yoav Y. and Chernyak, Yonatan and Gupta, Rajiv},
title = {X-ray Computed Tomography Through Scatter},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Image Generation from Sketch Constraint Using Contextual GAN
Lu, Yongyi and Wu, Shangzhe and Tai, Yu-Wing and Tang, Chi-Keung
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Yongyi and Wu, Shangzhe and Tai, Yu-Wing and Tang, Chi-Keung},
title = {Image Generation from Sketch Constraint Using Contextual GAN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images
Cai, Yujun and Ge, Liuhao and Cai, Jianfei and Yuan, Junsong
[pdf]
[bibtex]
@InProceedings{Cai_2018_ECCV,
author = {Cai, Yujun and Ge, Liuhao and Cai, Jianfei and Yuan, Junsong},
title = {Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SkipNet: Learning Dynamic Routing in Convolutional Networks
Wang, Xin and Yu, Fisher and Dou, Zi-Yi and Darrell, Trevor and Gonzalez, Joseph E.
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Xin and Yu, Fisher and Dou, Zi-Yi and Darrell, Trevor and Gonzalez, Joseph E.},
title = {SkipNet: Learning Dynamic Routing in Convolutional Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Point-to-Point Regression PointNet for 3D Hand Pose Estimation
Ge, Liuhao and Ren, Zhou and Yuan, Junsong
[pdf]
[bibtex]
@InProceedings{Ge_2018_ECCV,
author = {Ge, Liuhao and Ren, Zhou and Yuan, Junsong},
title = {Point-to-Point Regression PointNet for 3D Hand Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deeply Learned Compositional Models for Human Pose Estimation
Tang, Wei and Yu, Pei and Wu, Ying
[pdf]
[bibtex]
@InProceedings{Tang_2018_ECCV,
author = {Tang, Wei and Yu, Pei and Wu, Ying},
title = {Deeply Learned Compositional Models for Human Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Compound Memory Networks for Few-shot Video Classification
Zhu, Linchao and Yang, Yi
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Linchao and Yang, Yi},
title = {Compound Memory Networks for Few-shot Video Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation
Ye, Xiaoqing and Li, Jiamao and Huang, Hexiao and Du, Liang and Zhang, Xiaolin
[pdf]
[bibtex]
@InProceedings{Ye_2018_ECCV,
author = {Ye, Xiaoqing and Li, Jiamao and Huang, Hexiao and Du, Liang and Zhang, Xiaolin},
title = {3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Person Re-identification by Deep Learning Tracklet Association
Li, Minxian and Zhu, Xiatian and Gong, Shaogang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Minxian and Zhu, Xiatian and Gong, Shaogang},
title = {Unsupervised Person Re-identification by Deep Learning Tracklet Association},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Boosting for Image Denoising
Chen, Chang and Xiong, Zhiwei and Tian, Xinmei and Wu, Feng
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Chang and Xiong, Zhiwei and Tian, Xinmei and Wu, Feng},
title = {Deep Boosting for Image Denoising},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

The Contextual Loss for Image Transformation with Non-Aligned Data
Mechrez, Roey and Talmi, Itamar and Zelnik-Manor, Lihi
[pdf]
[bibtex]
@InProceedings{Mechrez_2018_ECCV,
author = {Mechrez, Roey and Talmi, Itamar and Zelnik-Manor, Lihi},
title = {The Contextual Loss for Image Transformation with Non-Aligned Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Actor-centric Relation Network
Sun, Chen and Shrivastava, Abhinav and Vondrick, Carl and Murphy, Kevin and Sukthankar, Rahul and Schmid, Cordelia
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Chen and Shrivastava, Abhinav and Vondrick, Carl and Murphy, Kevin and Sukthankar, Rahul and Schmid, Cordelia},
title = {Actor-centric Relation Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fully-Convolutional Point Networks for Large-Scale Point Clouds
Rethage, Dario and Wald, Johanna and Sturm, Jurgen and Navab, Nassir and Tombari, Federico
[pdf]
[bibtex]
@InProceedings{Rethage_2018_ECCV,
author = {Rethage, Dario and Wald, Johanna and Sturm, Jurgen and Navab, Nassir and Tombari, Federico},
title = {Fully-Convolutional Point Networks for Large-Scale Point Clouds},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint optimization for compressive video sensing and reconstruction under hardware constraints
Yoshida, Michitaka and Torii, Akihiko and Okutomi, Masatoshi and Endo, Kenta and Sugiyama, Yukinobu and Taniguchi, Rin-ichiro and Nagahara, Hajime
[pdf]
[bibtex]
@InProceedings{Yoshida_2018_ECCV,
author = {Yoshida, Michitaka and Torii, Akihiko and Okutomi, Masatoshi and Endo, Kenta and Sugiyama, Yukinobu and Taniguchi, Rin-ichiro and Nagahara, Hajime},
title = {Joint optimization for compressive video sensing and reconstruction under hardware constraints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improved Structure from Motion Using Fiducial Marker Matching
DeGol, Joseph and Bretl, Timothy and Hoiem, Derek
[pdf]
[bibtex]
@InProceedings{DeGol_2018_ECCV,
author = {DeGol, Joseph and Bretl, Timothy and Hoiem, Derek},
title = {Improved Structure from Motion Using Fiducial Marker Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling
Trumble, Matthew and Gilbert, Andrew and Hilton, Adrian and Collomosse, John
[pdf]
[bibtex]
@InProceedings{Trumble_2018_ECCV,
author = {Trumble, Matthew and Gilbert, Andrew and Hilton, Adrian and Collomosse, John},
title = {Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Integral Human Pose Regression
Sun, Xiao and Xiao, Bin and Wei, Fangyin and Liang, Shuang and Wei, Yichen
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Xiao and Xiao, Bin and Wei, Fangyin and Liang, Shuang and Wei, Yichen},
title = {Integral Human Pose Regression},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Convolutional Networks with Adaptive Inference Graphs
Veit, Andreas and Belongie, Serge
[pdf]
[bibtex]
@InProceedings{Veit_2018_ECCV,
author = {Veit, Andreas and Belongie, Serge},
title = {Convolutional Networks with Adaptive Inference Graphs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Dataset and Architecture for Visual Reasoning with a Working Memory
Robert Yang, Guangyu and Ganichev, Igor and Wang, Xiao-Jing and Shlens, Jonathon and Sussillo, David
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Robert Yang, Guangyu and Ganichev, Igor and Wang, Xiao-Jing and Shlens, Jonathon and Sussillo, David},
title = {A Dataset and Architecture for Visual Reasoning with a Working Memory},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Compression through Image Interpolation
Wu, Chao-Yuan and Singhal, Nayan and Krahenbuhl, Philipp
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Chao-Yuan and Singhal, Nayan and Krahenbuhl, Philipp},
title = {Video Compression through Image Interpolation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds
Idrees, Haroon and Tayyab, Muhmmad and Athrey, Kishan and Zhang, Dong and Al-Maadeed, Somaya and Rajpoot, Nasir and Shah, Mubarak
[pdf]
[bibtex]
@InProceedings{Idrees_2018_ECCV,
author = {Idrees, Haroon and Tayyab, Muhmmad and Athrey, Kishan and Zhang, Dong and Al-Maadeed, Somaya and Rajpoot, Nasir and Shah, Mubarak},
title = {Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Affinity Derivation and Graph Merge for Instance Segmentation
Liu, Yiding and Yang, Siyu and Li, Bin and Zhou, Wengang and Xu, Jizheng and Li, Houqiang and Lu, Yan
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Yiding and Yang, Siyu and Li, Bin and Zhou, Wengang and Xu, Jizheng and Li, Houqiang and Lu, Yan},
title = {Affinity Derivation and Graph Merge for Instance Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Progressive Structure from Motion
Locher, Alex and Havlena, Michal and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Locher_2018_ECCV,
author = {Locher, Alex and Havlena, Michal and Van Gool, Luc},
title = {Progressive Structure from Motion},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network
Kocabas, Muhammed and Karagoz, Salih and Akbas, Emre
[pdf]
[bibtex]
@InProceedings{Kocabas_2018_ECCV,
author = {Kocabas, Muhammed and Karagoz, Salih and Akbas, Emre},
title = {MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Self-Calibrating Isometric Non-Rigid Structure-from-Motion
Parashar, Shaifali and Bartoli, Adrien and Pizarro, Daniel
[pdf]
[bibtex]
@InProceedings{Parashar_2018_ECCV,
author = {Parashar, Shaifali and Bartoli, Adrien and Pizarro, Daniel},
title = {Self-Calibrating Isometric Non-Rigid Structure-from-Motion},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Using Object Information for Spotting Text
Prasad, Shitala and Wai Kin Kong, Adams
[pdf]
[bibtex]
@InProceedings{Prasad_2018_ECCV,
author = {Prasad, Shitala and Wai Kin Kong, Adams},
title = {Using Object Information for Spotting Text},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Modality Distillation with Multiple Stream Networks for Action Recognition
Garcia, Nuno C. and Morerio, Pietro and Murino, Vittorio
[pdf]
[bibtex]
@InProceedings{Garcia_2018_ECCV,
author = {Garcia, Nuno C. and Morerio, Pietro and Murino, Vittorio},
title = {Modality Distillation with Multiple Stream Networks for Action Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving
Li, Peiliang and Qin, Tong and Shen, Shaojie
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Peiliang and Qin, Tong and Shen, Shaojie},
title = {Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos
Shou, Zheng and Gao, Hang and Zhang, Lei and Miyazawa, Kazuyuki and Chang, Shih-Fu
[pdf]
[bibtex]
@InProceedings{Shou_2018_ECCV,
author = {Shou, Zheng and Gao, Hang and Zhang, Lei and Miyazawa, Kazuyuki and Chang, Shih-Fu},
title = {AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency
Zhou, Xingyi and Karpur, Arjun and Gan, Chuang and Luo, Linjie and Huang, Qixing
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Xingyi and Karpur, Arjun and Gan, Chuang and Luo, Linjie and Huang, Qixing},
title = {Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual-Inertial Object Detection and Mapping
Fei, Xiaohan and Soatto, Stefano
[pdf]
[bibtex]
@InProceedings{Fei_2018_ECCV,
author = {Fei, Xiaohan and Soatto, Stefano},
title = {Visual-Inertial Object Detection and Mapping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

FishEyeRecNet: A Multi-Context Collaborative Deep Network for Fisheye Image Rectification
Yin, Xiaoqing and Wang, Xinchao and Yu, Jun and Zhang, Maojun and Fua, Pascal and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Yin_2018_ECCV,
author = {Yin, Xiaoqing and Wang, Xinchao and Yu, Jun and Zhang, Maojun and Fua, Pascal and Tao, Dacheng},
title = {FishEyeRecNet: A Multi-Context Collaborative Deep Network for Fisheye Image Rectification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-supervised FusedGAN for Conditional Image Generation
Bodla, Navaneeth and Hua, Gang and Chellappa, Rama
[pdf]
[bibtex]
@InProceedings{Bodla_2018_ECCV,
author = {Bodla, Navaneeth and Hua, Gang and Chellappa, Rama},
title = {Semi-supervised FusedGAN for Conditional Image Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Group Normalization
Wu, Yuxin and He, Kaiming
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Yuxin and He, Kaiming},
title = {Group Normalization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Conditional Image-Text Embedding Networks
Plummer, Bryan A. and Kordas, Paige and Hadi Kiapour, M. and Zheng, Shuai and Piramuthu, Robinson and Lazebnik, Svetlana
[pdf]
[bibtex]
@InProceedings{Plummer_2018_ECCV,
author = {Plummer, Bryan A. and Kordas, Paige and Hadi Kiapour, M. and Zheng, Shuai and Piramuthu, Robinson and Lazebnik, Svetlana},
title = {Conditional Image-Text Embedding Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Co-Training for Semi-Supervised Image Recognition
Qiao, Siyuan and Shen, Wei and Zhang, Zhishuai and Wang, Bo and Yuille, Alan
[pdf]
[bibtex]
@InProceedings{Qiao_2018_ECCV,
author = {Qiao, Siyuan and Shen, Wei and Zhang, Zhishuai and Wang, Bo and Yuille, Alan},
title = {Deep Co-Training for Semi-Supervised Image Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Object Level Visual Reasoning in Videos
Baradel, Fabien and Neverova, Natalia and Wolf, Christian and Mille, Julien and Mori, Greg
[pdf]
[bibtex]
@InProceedings{Baradel_2018_ECCV,
author = {Baradel, Fabien and Neverova, Natalia and Wolf, Christian and Mille, Julien and Mori, Greg},
title = {Object Level Visual Reasoning in Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video
Li, Yin and Liu, Miao and Rehg, James M.
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yin and Liu, Miao and Rehg, James M.},
title = {In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Factorised Inverse-Sketching
Pang, Kaiyue and Li, Da and Song, Jifei and Song, Yi-Zhe and Xiang, Tao and Hospedales, Timothy M.
[pdf]
[bibtex]
@InProceedings{Pang_2018_ECCV,
author = {Pang, Kaiyue and Li, Da and Song, Jifei and Song, Yi-Zhe and Xiang, Tao and Hospedales, Timothy M.},
title = {Deep Factorised Inverse-Sketching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Yu, Youngjae and Kim, Jongseok and Kim, Gunhee
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Youngjae and Kim, Jongseok and Kim, Gunhee},
title = {A Joint Sequence Fusion Model for Video Question Answering and Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

View-graph Selection Framework for SfM
Shah, Rajvi and Chari, Visesh and J Narayanan, P
[pdf]
[bibtex]
@InProceedings{Shah_2018_ECCV,
author = {Shah, Rajvi and Chari, Visesh and J Narayanan, P},
title = {View-graph Selection Framework for SfM},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Synthetically Supervised Feature Learning for Scene Text Recognition
Liu, Yang and Wang, Zhaowen and Jin, Hailin and Wassell, Ian
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Yang and Wang, Zhaowen and Jin, Hailin and Wassell, Ian},
title = {Synthetically Supervised Feature Learning for Scene Text Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Clustering for Unsupervised Learning of Visual Features
Caron, Mathilde and Bojanowski, Piotr and Joulin, Armand and Douze, Matthijs
[pdf]
[bibtex]
@InProceedings{Caron_2018_ECCV,
author = {Caron, Mathilde and Bojanowski, Piotr and Joulin, Armand and Douze, Matthijs},
title = {Deep Clustering for Unsupervised Learning of Visual Features},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Is Robustness the Cost of Accuracy? -- A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
Su, Dong and Zhang, Huan and Chen, Hongge and Yi, Jinfeng and Chen, Pin-Yu and Gao, Yupeng
[pdf]
[bibtex]
@InProceedings{Su_2018_ECCV,
author = {Su, Dong and Zhang, Huan and Chen, Hongge and Yi, Jinfeng and Chen, Pin-Yu and Gao, Yupeng},
title = {Is Robustness the Cost of Accuracy? -- A Comprehensive Study on the Robustness of 18 Deep Image Classification Models},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Lifelong Learning via Progressive Distillation and Retrospection
Hou, Saihui and Pan, Xinyu and Change Loy, Chen and Wang, Zilei and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Hou_2018_ECCV,
author = {Hou, Saihui and Pan, Xinyu and Change Loy, Chen and Wang, Zilei and Lin, Dahua},
title = {Lifelong Learning via Progressive Distillation and Retrospection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Urban Zoning Using Higher-Order Markov Random Fields on Multi-View Imagery Data
Feng, Tian and Truong, Quang-Trung and Thanh Nguyen, Duc and Yu Koh, Jing and Yu, Lap-Fai and Binder, Alexander and Yeung, Sai-Kit
[pdf]
[bibtex]
@InProceedings{Feng_2018_ECCV,
author = {Feng, Tian and Truong, Quang-Trung and Thanh Nguyen, Duc and Yu Koh, Jing and Yu, Lap-Fai and Binder, Alexander and Yeung, Sai-Kit},
title = {Urban Zoning Using Higher-Order Markov Random Fields on Multi-View Imagery Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Progressive Neural Architecture Search
Liu, Chenxi and Zoph, Barret and Neumann, Maxim and Shlens, Jonathon and Hua, Wei and Li, Li-Jia and Fei-Fei, Li and Yuille, Alan and Huang, Jonathan and Murphy, Kevin
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Chenxi and Zoph, Barret and Neumann, Maxim and Shlens, Jonathon and Hua, Wei and Li, Li-Jia and Fei-Fei, Li and Yuille, Alan and Huang, Jonathan and Murphy, Kevin},
title = {Progressive Neural Architecture Search},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Single Image Water Hazard Detection using FCN with Reflection Attention Units
Han, Xiaofeng and Nguyen, Chuong and You, Shaodi and Lu, Jianfeng
[pdf]
[bibtex]
@InProceedings{Han_2018_ECCV,
author = {Han, Xiaofeng and Nguyen, Chuong and You, Shaodi and Lu, Jianfeng},
title = {Single Image Water Hazard Detection using FCN with Reflection Attention Units},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition
Huang, Yifei and Cai, Minjie and Li, Zhenqiang and Sato, Yoichi
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Yifei and Cai, Minjie and Li, Zhenqiang and Sato, Yoichi},
title = {Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Learning of Intrinsic Images and Semantic Segmentation
Baslamisli, Anil S. and Groenestege, Thomas T. and Das, Partha and Le, Hoang-An and Karaoglu, Sezer and Gevers, Theo
[pdf]
[bibtex]
@InProceedings{Baslamisli_2018_ECCV,
author = {Baslamisli, Anil S. and Groenestege, Thomas T. and Das, Partha and Le, Hoang-An and Karaoglu, Sezer and Gevers, Theo},
title = {Joint Learning of Intrinsic Images and Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Towards Robust Neural Networks via Random Self-ensemble
Liu, Xuanqing and Cheng, Minhao and Zhang, Huan and Hsieh, Cho-Jui
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Xuanqing and Cheng, Minhao and Zhang, Huan and Hsieh, Cho-Jui},
title = {Towards Robust Neural Networks via Random Self-ensemble},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Programmable Triangulation Light Curtains
Wang, Jian and Bartels, Joseph and Whittaker, William and Sankaranarayanan, Aswin C. and Narasimhan, Srinivasa G.
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Jian and Bartels, Joseph and Whittaker, William and Sankaranarayanan, Aswin C. and Narasimhan, Srinivasa G.},
title = {Programmable Triangulation Light Curtains},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries
Shao, Dian and Xiong, Yu and Zhao, Yue and Huang, Qingqiu and Qiao, Yu and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Shao_2018_ECCV,
author = {Shao, Dian and Xiong, Yu and Zhao, Yue and Huang, Qingqiu and Qiao, Yu and Lin, Dahua},
title = {Find and Focus: Retrieve and Localize Video Events with Natural Language Queries},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Rethinking the Form of Latent States in Image Captioning
Dai, Bo and Ye, Deming and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Dai_2018_ECCV,
author = {Dai, Bo and Ye, Deming and Lin, Dahua},
title = {Rethinking the Form of Latent States in Image Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CubeNet: Equivariance to 3D Rotation and Translation
Worrall, Daniel and Brostow, Gabriel
[pdf]
[bibtex]
@InProceedings{Worrall_2018_ECCV,
author = {Worrall, Daniel and Brostow, Gabriel},
title = {CubeNet: Equivariance to 3D Rotation and Translation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepWrinkles: Accurate and Realistic Clothing Modeling
Lahner, Zorah and Cremers, Daniel and Tung, Tony
[pdf]
[bibtex]
@InProceedings{Lahner_2018_ECCV,
author = {Lahner, Zorah and Cremers, Daniel and Tung, Tony},
title = {DeepWrinkles: Accurate and Realistic Clothing Modeling},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection
Zhu, Lei and Deng, Zijun and Hu, Xiaowei and Fu, Chi-Wing and Xu, Xuemiao and Qin, Jing and Heng, Pheng-Ann
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Lei and Deng, Zijun and Hu, Xiaowei and Fu, Chi-Wing and Xu, Xuemiao and Qin, Jing and Heng, Pheng-Ann},
title = {Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Regression Tracking with Shrinkage Loss
Lu, Xiankai and Ma, Chao and Ni, Bingbing and Yang, Xiaokang and Reid, Ian and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Xiankai and Ma, Chao and Ni, Bingbing and Yang, Xiaokang and Reid, Ian and Yang, Ming-Hsuan},
title = {Deep Regression Tracking with Shrinkage Loss},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Super-Resolution and Sparse View CT Reconstruction
Zang, Guangming and Aly, Mohamed and Idoughi, Ramzi and Wonka, Peter and Heidrich, Wolfgang
[pdf]
[bibtex]
@InProceedings{Zang_2018_ECCV,
author = {Zang, Guangming and Aly, Mohamed and Idoughi, Ramzi and Wonka, Peter and Heidrich, Wolfgang},
title = {Super-Resolution and Sparse View CT Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang
[pdf]
[bibtex]
@InProceedings{Lyu_2018_ECCV,
author = {Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang},
title = {Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Dodge A Bullet: Concyclic View Morphing via Deep Learning
Jin, Shi and Liu, Ruiyang and Ji, Yu and Ye, Jinwei and Yu, Jingyi
[pdf]
[bibtex]
@InProceedings{Jin_2018_ECCV,
author = {Jin, Shi and Liu, Ruiyang and Ji, Yu and Ye, Jinwei and Yu, Jingyi},
title = {Learning to Dodge A Bullet: Concyclic View Morphing via Deep Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deterministic Consensus Maximization with Biconvex Programming
Cai, Zhipeng and Chin, Tat-Jun and Le, Huu and Suter, David
[pdf]
[bibtex]
@InProceedings{Cai_2018_ECCV,
author = {Cai, Zhipeng and Chin, Tat-Jun and Le, Huu and Suter, David},
title = {Deterministic Consensus Maximization with Biconvex Programming},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms
Nitin Bhagoji, Arjun and He, Warren and Li, Bo and Song, Dawn
[pdf]
[bibtex]
@InProceedings{Bhagoji_2018_ECCV,
author = {Nitin Bhagoji, Arjun and He, Warren and Li, Bo and Song, Dawn},
title = {Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Propagating LSTM: 3D Pose Estimation based on Joint Interdependency
Lee, Kyoungoh and Lee, Inwoong and Lee, Sanghoon
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Kyoungoh and Lee, Inwoong and Lee, Sanghoon},
title = {Propagating LSTM: 3D Pose Estimation based on Joint Interdependency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation
Fan, Ruochen and Hou, Qibin and Cheng, Ming-Ming and Yu, Gang and Martin, Ralph R. and Hu, Shi-Min
[pdf]
[bibtex]
@InProceedings{Fan_2018_ECCV,
author = {Fan, Ruochen and Hou, Qibin and Cheng, Ming-Ming and Yu, Gang and Martin, Ralph R. and Hu, Shi-Min},
title = {Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
Harwath, David and Recasens, Adria and Suris, Didac and Chuang, Galen and Torralba, Antonio and Glass, James
[pdf]
[bibtex]
@InProceedings{Harwath_2018_ECCV,
author = {Harwath, David and Recasens, Adria and Suris, Didac and Chuang, Galen and Torralba, Antonio and Glass, James},
title = {Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Tracking via Spatially Aligned Correlation Filters Network
Zhang, Mengdan and Wang, Qiang and Xing, Junliang and Gao, Jin and Peng, Peixi and Hu, Weiming and Maybank, Steve
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Mengdan and Wang, Qiang and Xing, Junliang and Gao, Jin and Peng, Peixi and Hu, Weiming and Maybank, Steve},
title = {Visual Tracking via Spatially Aligned Correlation Filters Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HairNet: Single-View Hair Reconstruction using Convolutional Neural Networks
Zhou, Yi and Hu, Liwen and Xing, Jun and Chen, Weikai and Kung, Han-Wei and Tong, Xin and Li, Hao
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Yi and Hu, Liwen and Xing, Jun and Chen, Weikai and Kung, Han-Wei and Tong, Xin and Li, Hao},
title = {HairNet: Single-View Hair Reconstruction using Convolutional Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

The Sound of Pixels
Zhao, Hang and Gan, Chuang and Rouditchenko, Andrew and Vondrick, Carl and McDermott, Josh and Torralba, Antonio
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Hang and Gan, Chuang and Rouditchenko, Andrew and Vondrick, Carl and McDermott, Josh and Torralba, Antonio},
title = {The Sound of Pixels},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Shape Reconstruction Using Volume Sweeping and Learned Photoconsistency
Leroy, Vincent and Franco, Jean-Sebastien and Boyer, Edmond
[pdf]
[bibtex]
@InProceedings{Leroy_2018_ECCV,
author = {Leroy, Vincent and Franco, Jean-Sebastien and Boyer, Edmond},
title = {Shape Reconstruction Using Volume Sweeping and Learned Photoconsistency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Quantized Densely Connected U-Nets for Efficient Landmark Localization
Tang, Zhiqiang and Peng, Xi and Geng, Shijie and Wu, Lingfei and Zhang, Shaoting and Metaxas, Dimitris
[pdf]
[bibtex]
@InProceedings{Tang_2018_ECCV,
author = {Tang, Zhiqiang and Peng, Xi and Geng, Shijie and Wu, Lingfei and Zhang, Shaoting and Metaxas, Dimitris},
title = {Quantized Densely Connected U-Nets for Efficient Landmark Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint 3D tracking of a deformable object in interaction with a hand
Tsoli, Aggeliki and Argyros, Antonis A.
[pdf]
[bibtex]
@InProceedings{Tsoli_2018_ECCV,
author = {Tsoli, Aggeliki and Argyros, Antonis A.},
title = {Joint 3D tracking of a deformable object in interaction with a hand},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Move Forward and Tell: A Progressive Generator of Video Descriptions
Xiong, Yilei and Dai, Bo and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Xiong_2018_ECCV,
author = {Xiong, Yilei and Dai, Bo and Lin, Dahua},
title = {Move Forward and Tell: A Progressive Generator of Video Descriptions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Face Recognition with Contrastive Convolution
Han, Chunrui and Shan, Shiguang and Kan, Meina and Wu, Shuzhe and Chen, Xilin
[pdf]
[bibtex]
@InProceedings{Han_2018_ECCV,
author = {Han, Chunrui and Shan, Shiguang and Kan, Meina and Wu, Shuzhe and Chen, Xilin},
title = {Face Recognition with Contrastive Convolution},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Repeatability Is Not Enough: Learning Affine Regions via Discriminability
Mishkin, Dmytro and Radenovic, Filip and Matas, Jiri
[pdf]
[bibtex]
@InProceedings{Mishkin_2018_ECCV,
author = {Mishkin, Dmytro and Radenovic, Filip and Matas, Jiri},
title = {Repeatability Is Not Enough: Learning Affine Regions via Discriminability},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset
Guo, Qi and Frosio, Iuri and Gallo, Orazio and Zickler, Todd and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Qi and Frosio, Iuri and Gallo, Orazio and Zickler, Todd and Kautz, Jan},
title = {Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Using LIP to Gloss Over Faces in Single-Stage Face Detection Networks
Yang, Siqi and Wiliem, Arnold and Chen, Shaokang and Lovell, Brian C.
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Siqi and Wiliem, Arnold and Chen, Shaokang and Lovell, Brian C.},
title = {Using LIP to Gloss Over Faces in Single-Stage Face Detection Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Motion Feature Network: Fixed Motion Filter for Action Recognition
Lee, Myunggi and Lee, Seungeui and Son, Sungjoon and Park, Gyutae and Kwak, Nojun
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Myunggi and Lee, Seungeui and Son, Sungjoon and Park, Gyutae and Kwak, Nojun},
title = {Motion Feature Network: Fixed Motion Filter for Action Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study
Wu, Zhenyu and Wang, Zhangyang and Wang, Zhaowen and Jin, Hailin
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Zhenyu and Wang, Zhangyang and Wang, Zhaowen and Jin, Hailin},
title = {Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Compression from Limited Unlabeled Data
He, Xiangyu and Cheng, Jian
[pdf]
[bibtex]
@InProceedings{He_2018_ECCV,
author = {He, Xiangyu and Cheng, Jian},
title = {Learning Compression from Limited Unlabeled Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepVS: A Deep Learning Based Video Saliency Prediction Approach
Jiang, Lai and Xu, Mai and Liu, Tie and Qiao, Minglang and Wang, Zulin
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Lai and Xu, Mai and Liu, Tie and Qiao, Minglang and Wang, Zulin},
title = {DeepVS: A Deep Learning Based Video Saliency Prediction Approach},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ADVIO: An Authentic Dataset for Visual-Inertial Odometry
Cortes, Santiago and Solin, Arno and Rahtu, Esa and Kannala, Juho
[pdf]
[bibtex]
@InProceedings{Cortes_2018_ECCV,
author = {Cortes, Santiago and Solin, Arno and Rahtu, Esa and Kannala, Juho},
title = {ADVIO: An Authentic Dataset for Visual-Inertial Odometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Adversarial Geometry-Aware Human Motion Prediction
Gui, Liang-Yan and Wang, Yu-Xiong and Liang, Xiaodan and Moura, Jose M. F.
[pdf]
[bibtex]
@InProceedings{Gui_2018_ECCV,
author = {Gui, Liang-Yan and Wang, Yu-Xiong and Liang, Xiaodan and Moura, Jose M. F.},
title = {Adversarial Geometry-Aware Human Motion Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Online Dictionary Learning for Approximate Archetypal Analysis
Mei, Jieru and Wang, Chunyu and Zeng, Wenjun
[pdf]
[bibtex]
@InProceedings{Mei_2018_ECCV,
author = {Mei, Jieru and Wang, Chunyu and Zeng, Wenjun},
title = {Online Dictionary Learning for Approximate Archetypal Analysis},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Rendering Portraitures from Monocular Camera and Beyond
Xu, Xiangyu and Sun, Deqing and Liu, Sifei and Ren, Wenqi and Zhang, Yu-Jin and Yang, Ming-Hsuan and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Xiangyu and Sun, Deqing and Liu, Sifei and Ren, Wenqi and Zhang, Yu-Jin and Yang, Ming-Hsuan and Sun, Jian},
title = {Rendering Portraitures from Monocular Camera and Beyond},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attributes as Operators: Factorizing Unseen Attribute-Object Compositions
Nagarajan, Tushar and Grauman, Kristen
[pdf]
[bibtex]
@InProceedings{Nagarajan_2018_ECCV,
author = {Nagarajan, Tushar and Grauman, Kristen},
title = {Attributes as Operators: Factorizing Unseen Attribute-Object Compositions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Scaling Egocentric Vision: The EPIC-KITCHENS Dataset
Damen, Dima and Doughty, Hazel and Maria Farinella, Giovanni and Fidler, Sanja and Furnari, Antonino and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan and Perrett, Toby and Price, Will and Wray, Michael
[pdf]
[bibtex]
@InProceedings{Damen_2018_ECCV,
author = {Damen, Dima and Doughty, Hazel and Maria Farinella, Giovanni and Fidler, Sanja and Furnari, Antonino and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan and Perrett, Toby and Price, Will and Wray, Michael},
title = {Scaling Egocentric Vision: The EPIC-KITCHENS Dataset},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Realtime Time Synchronized Event-based Stereo
Zihao Zhu, Alex and Chen, Yibo and Daniilidis, Kostas
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zihao Zhu, Alex and Chen, Yibo and Daniilidis, Kostas},
title = {Realtime Time Synchronized Event-based Stereo},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Memory Aware Synapses: Learning what (not) to forget
Aljundi, Rahaf and Babiloni, Francesca and Elhoseiny, Mohamed and Rohrbach, Marcus and Tuytelaars, Tinne
[pdf]
[bibtex]
@InProceedings{Aljundi_2018_ECCV,
author = {Aljundi, Rahaf and Babiloni, Francesca and Elhoseiny, Mohamed and Rohrbach, Marcus and Tuytelaars, Tinne},
title = {Memory Aware Synapses: Learning what (not) to forget},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning and Matching Multi-View Descriptors for Registration of Point Clouds
Zhou, Lei and Zhu, Siyu and Luo, Zixin and Shen, Tianwei and Zhang, Runze and Zhen, Mingmin and Fang, Tian and Quan, Long
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Lei and Zhu, Siyu and Luo, Zixin and Shen, Tianwei and Zhang, Runze and Zhen, Mingmin and Fang, Tian and Quan, Long},
title = {Learning and Matching Multi-View Descriptors for Registration of Point Clouds},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-Dense 3D Reconstruction with a Stereo Event Camera
Zhou, Yi and Gallego, Guillermo and Rebecq, Henri and Kneip, Laurent and Li, Hongdong and Scaramuzza, Davide
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Yi and Gallego, Guillermo and Rebecq, Henri and Kneip, Laurent and Li, Hongdong and Scaramuzza, Davide},
title = {Semi-Dense 3D Reconstruction with a Stereo Event Camera},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Scale-Awareness of Light Field Camera based Visual Odometry
Zeller, Niclas and Quint, Franz and Stilla, Uwe
[pdf]
[bibtex]
@InProceedings{Zeller_2018_ECCV,
author = {Zeller, Niclas and Quint, Franz and Stilla, Uwe},
title = {Scale-Awareness of Light Field Camera based Visual Odometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Revisiting Autofocus for Smartphone Cameras
Abuolaim, Abdullah and Punnappurath, Abhijith and Brown, Michael S.
[pdf]
[bibtex]
@InProceedings{Abuolaim_2018_ECCV,
author = {Abuolaim, Abdullah and Punnappurath, Abhijith and Brown, Michael S.},
title = {Revisiting Autofocus for Smartphone Cameras},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Adversarial Attention Alignment for Unsupervised Domain Adaptation: the Benefit of Target Expectation Maximization
Kang, Guoliang and Zheng, Liang and Yan, Yan and Yang, Yi
[pdf]
[bibtex]
@InProceedings{Kang_2018_ECCV,
author = {Kang, Guoliang and Zheng, Liang and Yan, Yan and Yang, Yi},
title = {Deep Adversarial Attention Alignment for Unsupervised Domain Adaptation: the Benefit of Target Expectation Maximization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient 6-DoF Tracking of Handheld Objects from an Egocentric Viewpoint
Pandey, Rohit and Pidlypenskyi, Pavel and Yang, Shuoran and Kaeser-Chen, Christine
[pdf]
[bibtex]
@InProceedings{Pandey_2018_ECCV,
author = {Pandey, Rohit and Pidlypenskyi, Pavel and Yang, Shuoran and Kaeser-Chen, Christine},
title = {Efficient 6-DoF Tracking of Handheld Objects from an Egocentric Viewpoint},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

How good is my GAN?
Shmelkov, Konstantin and Schmid, Cordelia and Alahari, Karteek
[pdf]
[bibtex]
@InProceedings{Shmelkov_2018_ECCV,
author = {Shmelkov, Konstantin and Schmid, Cordelia and Alahari, Karteek},
title = {How good is my GAN?},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Superpixel Sampling Networks
Jampani, Varun and Sun, Deqing and Liu, Ming-Yu and Yang, Ming-Hsuan and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Jampani_2018_ECCV,
author = {Jampani, Varun and Sun, Deqing and Liu, Ming-Yu and Yang, Ming-Hsuan and Kautz, Jan},
title = {Superpixel Sampling Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Effective Use of Synthetic Data for Urban Scene Semantic Segmentation
Sadat Saleh, Fatemeh and Sadegh Aliakbarian, Mohammad and Salzmann, Mathieu and Petersson, Lars and Alvarez, Jose M.
[pdf]
[bibtex]
@InProceedings{Saleh_2018_ECCV,
author = {Sadat Saleh, Fatemeh and Sadegh Aliakbarian, Mohammad and Salzmann, Mathieu and Petersson, Lars and Alvarez, Jose M.},
title = {Effective Use of Synthetic Data for Urban Scene Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generating 3D Faces using Convolutional Mesh Autoencoders
Ranjan, Anurag and Bolkart, Timo and Sanyal, Soubhik and Black, Michael J.
[pdf]
[bibtex]
@InProceedings{Ranjan_2018_ECCV,
author = {Ranjan, Anurag and Bolkart, Timo and Sanyal, Soubhik and Black, Michael J.},
title = {Generating 3D Faces using Convolutional Mesh Autoencoders},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D Face Reconstruction from Light Field Images: A Model-free Approach
Feng, Mingtao and Zulqarnain Gilani, Syed and Wang, Yaonan and Mian, Ajmal
[pdf]
[bibtex]
@InProceedings{Feng_2018_ECCV,
author = {Feng, Mingtao and Zulqarnain Gilani, Syed and Wang, Yaonan and Mian, Ajmal},
title = {3D Face Reconstruction from Light Field Images: A Model-free Approach},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval
Zhang, Xi and Lai, Hanjiang and Feng, Jiashi
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Xi and Lai, Hanjiang and Feng, Jiashi},
title = {Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Museum Exhibit Identification Challenge for the Supervised Domain Adaptation and Beyond
Koniusz, Piotr and Tas, Yusuf and Zhang, Hongguang and Harandi, Mehrtash and Porikli, Fatih and Zhang, Rui
[pdf]
[bibtex]
@InProceedings{Koniusz_2018_ECCV,
author = {Koniusz, Piotr and Tas, Yusuf and Zhang, Hongguang and Harandi, Mehrtash and Porikli, Fatih and Zhang, Rui},
title = {Museum Exhibit Identification Challenge for the Supervised Domain Adaptation and Beyond},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

End-to-End Deep Structured Models for Drawing Crosswalks
Liang, Justin and Urtasun, Raquel
[pdf]
[bibtex]
@InProceedings{Liang_2018_ECCV,
author = {Liang, Justin and Urtasun, Raquel},
title = {End-to-End Deep Structured Models for Drawing Crosswalks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Visual Question Answering by Bootstrapping Hard Attention
Malinowski, Mateusz and Doersch, Carl and Santoro, Adam and Battaglia, Peter
[pdf]
[bibtex]
@InProceedings{Malinowski_2018_ECCV,
author = {Malinowski, Mateusz and Doersch, Carl and Santoro, Adam and Battaglia, Peter},
title = {Learning Visual Question Answering by Bootstrapping Hard Attention},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
Shao, Zhiwen and Liu, Zhilei and Cai, Jianfei and Ma, Lizhuang
[pdf]
[bibtex]
@InProceedings{Shao_2018_ECCV,
author = {Shao, Zhiwen and Liu, Zhilei and Cai, Jianfei and Ma, Lizhuang},
title = {Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Data-Driven Sparse Structure Selection for Deep Neural Networks
Huang, Zehao and Wang, Naiyan
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Zehao and Wang, Naiyan},
title = {Data-Driven Sparse Structure Selection for Deep Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

To learn image super-resolution, use a GAN to learn how to do image degradation first
Bulat, Adrian and Yang, Jing and Tzimiropoulos, Georgios
[pdf]
[bibtex]
@InProceedings{Bulat_2018_ECCV,
author = {Bulat, Adrian and Yang, Jing and Tzimiropoulos, Georgios},
title = {To learn image super-resolution, use a GAN to learn how to do image degradation first},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Self-Supervised Relative Depth Learning for Urban Scene Understanding
Jiang, Huaizu and Larsson, Gustav and Maire, Michael and Shakhnarovich, Greg and Learned-Miller, Erik
[pdf]
[bibtex]
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Huaizu and Larsson, Gustav and Maire, Michael and Shakhnarovich, Greg and Learned-Miller, Erik},
title = {Self-Supervised Relative Depth Learning for Urban Scene Understanding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

End-to-End Joint Semantic Segmentation of Actors and Actions in Video
Ji, Jingwei and Buch, Shyamal and Soto, Alvaro and Niebles, Juan Carlos
[pdf]
[bibtex]
@InProceedings{Ji_2018_ECCV,
author = {Ji, Jingwei and Buch, Shyamal and Soto, Alvaro and Niebles, Juan Carlos},
title = {End-to-End Joint Semantic Segmentation of Actors and Actions in Video},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Texture and Structure Aware Filtering Network for Image Smoothing
Lu, Kaiyue and You, Shaodi and Barnes, Nick
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Kaiyue and You, Shaodi and Barnes, Nick},
title = {Deep Texture and Structure Aware Filtering Network for Image Smoothing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pairwise Relational Networks for Face Recognition
Kang, Bong-Nam and Kim, Yonghyun and Kim, Daijin
[pdf]
[bibtex]
@InProceedings{Kang_2018_ECCV,
author = {Kang, Bong-Nam and Kim, Yonghyun and Kim, Daijin},
title = {Pairwise Relational Networks for Face Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction
Xu, Kai and Zhang, Zhikang and Ren, Fengbo
[pdf]
[bibtex]
@InProceedings{XU_2018_ECCV,
author = {Xu, Kai and Zhang, Zhikang and Ren, Fengbo},
title = {LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Warped Guidance for Blind Face Restoration
Li, Xiaoming and Liu, Ming and Ye, Yuting and Zuo, Wangmeng and Lin, Liang and Yang, Ruigang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Xiaoming and Liu, Ming and Ye, Yuting and Zuo, Wangmeng and Lin, Liang and Yang, Ruigang},
title = {Learning Warped Guidance for Blind Face Restoration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Shift-Net: Image Inpainting via Deep Feature Rearrangement
Yan, Zhaoyi and Li, Xiaoming and Li, Mu and Zuo, Wangmeng and Shan, Shiguang
[pdf]
[bibtex]
@InProceedings{Yan_2018_ECCV,
author = {Yan, Zhaoyi and Li, Xiaoming and Li, Mu and Zuo, Wangmeng and Shan, Shiguang},
title = {Shift-Net: Image Inpainting via Deep Feature Rearrangement},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Question-Guided Hybrid Convolution for Visual Question Answering
Gao, Peng and Li, Hongsheng and Li, Shuang and Lu, Pan and Li, Yikang and Hoi, Steven C.H. and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Gao_2018_ECCV,
author = {Gao, Peng and Li, Hongsheng and Li, Shuang and Lu, Pan and Li, Yikang and Hoi, Steven C.H. and Wang, Xiaogang},
title = {Question-Guided Hybrid Convolution for Visual Question Answering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders
Jha, Ananya Harsh and Anand, Saket and Singh, Maneesh and Veeravasarapu, VSR
[pdf]
[bibtex]
@InProceedings{Jha_2018_ECCV,
author = {Jha, Ananya Harsh and Anand, Saket and Singh, Maneesh and Veeravasarapu, VSR},
title = {Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Fundamental Matrix Estimation
Ranftl, Rene and Koltun, Vladlen
[pdf]
[bibtex]
@InProceedings{Ranftl_2018_ECCV,
author = {Ranftl, Rene and Koltun, Vladlen},
title = {Deep Fundamental Matrix Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Where are the blobs: Counting by Localization with Point Supervision
Laradji, Issam H. and Rostamzadeh, Negar and Pinheiro, Pedro O. and Vazquez, David and Schmidt, Mark
[pdf]
[bibtex]
@InProceedings{Laradji_2018_ECCV,
author = {Laradji, Issam H. and Rostamzadeh, Negar and Pinheiro, Pedro O. and Vazquez, David and Schmidt, Mark},
title = {Where are the blobs: Counting by Localization with Point Supervision},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pose Guided Human Video Generation
Yang, Ceyuan and Wang, Zhe and Zhu, Xinge and Huang, Chen and Shi, Jianping and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Ceyuan and Wang, Zhe and Zhu, Xinge and Huang, Chen and Shi, Jianping and Lin, Dahua},
title = {Pose Guided Human Video Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Real-time 'Actor-Critic' Tracking
Chen, Boyu and Wang, Dong and Li, Peixia and Wang, Shuang and Lu, Huchuan
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Boyu and Wang, Dong and Li, Peixia and Wang, Shuang and Lu, Huchuan},
title = {Real-time 'Actor-Critic' Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Estimating the Success of Unsupervised Image to Image Translation
Benaim, Sagie and Galanti, Tomer and Wolf, Lior
[pdf]
[bibtex]
@InProceedings{Benaim_2018_ECCV,
author = {Benaim, Sagie and Galanti, Tomer and Wolf, Lior},
title = {Estimating the Success of Unsupervised Image to Image Translation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Bilevel Learning
Jenni, Simon and Favaro, Paolo
[pdf]
[bibtex]
@InProceedings{Jenni_2018_ECCV,
author = {Jenni, Simon and Favaro, Paolo},
title = {Deep Bilevel Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Sparsely Aggregated Convolutional Networks
Zhu, Ligeng and Deng, Ruizhi and Maire, Michael and Deng, Zhiwei and Mori, Greg and Tan, Ping
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Ligeng and Deng, Ruizhi and Maire, Michael and Deng, Zhiwei and Mori, Greg and Tan, Ping},
title = {Sparsely Aggregated Convolutional Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Interpretable Intuitive Physics Model
Ye, Tian and Wang, Xiaolong and Davidson, James and Gupta, Abhinav
[pdf]
[bibtex]
@InProceedings{Ye_2018_ECCV,
author = {Ye, Tian and Wang, Xiaolong and Davidson, James and Gupta, Abhinav},
title = {Interpretable Intuitive Physics Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Appearance-Based Gaze Estimation via Evaluation-Guided Asymmetric Regression
Cheng, Yihua and Lu, Feng and Zhang, Xucong
[pdf]
[bibtex]
@InProceedings{Cheng_2018_ECCV,
author = {Cheng, Yihua and Lu, Feng and Zhang, Xucong},
title = {Appearance-Based Gaze Estimation via Evaluation-Guided Asymmetric Regression},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ADVISE: Symbolism and External Knowledge for Decoding Advertisements
Ye, Keren and Kovashka, Adriana
[pdf]
[bibtex]
@InProceedings{Ye_2018_ECCV,
author = {Ye, Keren and Kovashka, Adriana},
title = {ADVISE: Symbolism and External Knowledge for Decoding Advertisements},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Toward Characteristic-Preserving Image-based Virtual Try-On Network
Wang, Bochao and Zheng, Huabin and Liang, Xiaodan and Chen, Yimin and Lin, Liang and Yang, Meng
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Bochao and Zheng, Huabin and Liang, Xiaodan and Chen, Yimin and Lin, Liang and Yang, Meng},
title = {Toward Characteristic-Preserving Image-based Virtual Try-On Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Closed-form Solution to Photorealistic Image Stylization
Li, Yijun and Liu, Ming-Yu and Li, Xueting and Yang, Ming-Hsuan and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yijun and Liu, Ming-Yu and Li, Xueting and Yang, Ming-Hsuan and Kautz, Jan},
title = {A Closed-form Solution to Photorealistic Image Stylization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Understanding Degeneracies and Ambiguities in Attribute Transfer
Szabo, Attila and Hu, Qiyang and Portenier, Tiziano and Zwicker, Matthias and Favaro, Paolo
[pdf]
[bibtex]
@InProceedings{Szabo_2018_ECCV,
author = {Szabo, Attila and Hu, Qiyang and Portenier, Tiziano and Zwicker, Matthias and Favaro, Paolo},
title = {Understanding Degeneracies and Ambiguities in Attribute Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm
Liu, Zechun and Wu, Baoyuan and Luo, Wenhan and Yang, Xin and Liu, Wei and Cheng, Kwang-Ting
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Zechun and Wu, Baoyuan and Luo, Wenhan and Yang, Xin and Liu, Wei and Cheng, Kwang-Ting},
title = {Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos
Liu, Bingbin and Yeung, Serena and Chou, Edward and Huang, De-An and Fei-Fei, Li and Niebles, Juan Carlos
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Bingbin and Yeung, Serena and Chou, Edward and Huang, De-An and Fei-Fei, Li and Niebles, Juan Carlos},
title = {Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Neural Stereoscopic Image Style Transfer
Gong, Xinyu and Huang, Haozhi and Ma, Lin and Shen, Fumin and Liu, Wei and Zhang, Tong
[pdf]
[bibtex]
@InProceedings{Gong_2018_ECCV,
author = {Gong, Xinyu and Huang, Haozhi and Ma, Lin and Shen, Fumin and Liu, Wei and Zhang, Tong},
title = {Neural Stereoscopic Image Style Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HiDDeN: Hiding Data with Deep Networks
Zhu, Jiren and Kaplan, Russell and Johnson, Justin and Fei-Fei, Li
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Jiren and Kaplan, Russell and Johnson, Justin and Fei-Fei, Li},
title = {HiDDeN: Hiding Data with Deep Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network
Ye, Qi and Kim, Tae-Kyun
[pdf]
[bibtex]
@InProceedings{Ye_2018_ECCV,
author = {Ye, Qi and Kim, Tae-Kyun},
title = {Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Conditional Prior Networks for Optical Flow
Yang, Yanchao and Soatto, Stefano
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Yanchao and Soatto, Stefano},
title = {Conditional Prior Networks for Optical Flow},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning 3D Keypoint Descriptors for Non-Rigid Shape Matching
Wang, Hanyu and Guo, Jianwei and Yan, Dong-Ming and Quan, Weize and Zhang, Xiaopeng
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Hanyu and Guo, Jianwei and Yan, Dong-Ming and Quan, Weize and Zhang, Xiaopeng},
title = {Learning 3D Keypoint Descriptors for Non-Rigid Shape Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Stacked Cross Attention for Image-Text Matching
Lee, Kuang-Huei and Chen, Xi and Hua, Gang and Hu, Houdong and He, Xiaodong
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Kuang-Huei and Chen, Xi and Hua, Gang and Hu, Houdong and He, Xiaodong},
title = {Stacked Cross Attention for Image-Text Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Summarization Using Fully Convolutional Sequence Networks
Rochan, Mrigank and Ye, Linwei and Wang, Yang
[pdf]
[bibtex]
@InProceedings{Rochan_2018_ECCV,
author = {Rochan, Mrigank and Ye, Linwei and Wang, Yang},
title = {Video Summarization Using Fully Convolutional Sequence Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unveiling the Power of Deep Tracking
Bhat, Goutam and Johnander, Joakim and Danelljan, Martin and Shahbaz Khan, Fahad and Felsberg, Michael
[pdf]
[bibtex]
@InProceedings{Bhat_2018_ECCV,
author = {Bhat, Goutam and Johnander, Joakim and Danelljan, Martin and Shahbaz Khan, Fahad and Felsberg, Michael},
title = {Unveiling the Power of Deep Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Weakly Supervised Region Proposal Network and Object Detection
Tang, Peng and Wang, Xinggang and Wang, Angtian and Yan, Yongluan and Liu, Wenyu and Huang, Junzhou and Yuille, Alan
[pdf]
[bibtex]
@InProceedings{Tang_2018_ECCV,
author = {Tang, Peng and Wang, Xinggang and Wang, Angtian and Yan, Yongluan and Liu, Wenyu and Huang, Junzhou and Yuille, Alan},
title = {Weakly Supervised Region Proposal Network and Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

The Devil of Face Recognition is in the Noise
Wang, Fei and Chen, Liren and Li, Cheng and Huang, Shiyao and Chen, Yanjie and Qian, Chen and Change Loy, Chen
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Fei and Chen, Liren and Li, Cheng and Huang, Shiyao and Chen, Yanjie and Qian, Chen and Change Loy, Chen},
title = {The Devil of Face Recognition is in the Noise},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SwapNet: Garment Transfer in Single View Images
Raj, Amit and Sangkloy, Patsorn and Chang, Huiwen and Lu, Jingwan and Ceylan, Duygu and Hays, James
[pdf]
[bibtex]
@InProceedings{Raj_2018_ECCV,
author = {Raj, Amit and Sangkloy, Patsorn and Chang, Huiwen and Lu, Jingwan and Ceylan, Duygu and Hays, James},
title = {SwapNet: Garment Transfer in Single View Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Egocentric Activity Prediction via Event Modulated Attention
Shen, Yang and Ni, Bingbing and Li, Zefan and Zhuang, Ning
[pdf]
[bibtex]
@InProceedings{Shen_2018_ECCV,
author = {Shen, Yang and Ni, Bingbing and Li, Zefan and Zhuang, Ning},
title = {Egocentric Activity Prediction via Event Modulated Attention},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Person Search in Videos with One Portrait Through Visual and Temporal Links
Huang, Qingqiu and Liu, Wentao and Lin, Dahua
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Qingqiu and Liu, Wentao and Lin, Dahua},
title = {Person Search in Videos with One Portrait Through Visual and Temporal Links},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Stereo Computation for a Single Mixture Image
Zhong, Yiran and Dai, Yuchao and Li, Hongdong
[pdf]
[bibtex]
@InProceedings{Zhong_2018_ECCV,
author = {Zhong, Yiran and Dai, Yuchao and Li, Hongdong},
title = {Stereo Computation for a Single Mixture Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Value-aware Quantization for Training and Inference of Neural Networks
Park, Eunhyeok and Yoo, Sungjoo and Vajda, Peter
[pdf]
[bibtex]
@InProceedings{Park_2018_ECCV,
author = {Park, Eunhyeok and Yoo, Sungjoo and Vajda, Peter},
title = {Value-aware Quantization for Training and Inference of Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Explainable Neural Computation via Stack Neural Module Networks
Hu, Ronghang and Andreas, Jacob and Darrell, Trevor and Saenko, Kate
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Ronghang and Andreas, Jacob and Darrell, Trevor and Saenko, Kate},
title = {Explainable Neural Computation via Stack Neural Module Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model
Gecer, Baris and Bhattarai, Binod and Kittler, Josef and Kim, Tae-Kyun
[pdf]
[bibtex]
@InProceedings{Gecer_2018_ECCV,
author = {Gecer, Baris and Bhattarai, Binod and Kittler, Josef and Kim, Tae-Kyun},
title = {Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights
Wan, Diwen and Shen, Fumin and Liu, Li and Zhu, Fan and Qin, Jie and Shao, Ling and Tao Shen, Heng
[pdf]
[bibtex]
@InProceedings{Wan_2018_ECCV,
author = {Wan, Diwen and Shen, Fumin and Liu, Li and Zhu, Fan and Qin, Jie and Shao, Ling and Tao Shen, Heng},
title = {TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes
Zhan, Fangneng and Lu, Shijian and Xue, Chuhui
[pdf]
[bibtex]
@InProceedings{Zhan_2018_ECCV,
author = {Zhan, Fangneng and Lu, Shijian and Xue, Chuhui},
title = {Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

What do I Annotate Next? An Empirical Study of Active Learning for Action Localization
Caba Heilbron, Fabian and Lee, Joon-Young and Jin, Hailin and Ghanem, Bernard
[pdf]
[bibtex]
@InProceedings{Heilbron_2018_ECCV,
author = {Caba Heilbron, Fabian and Lee, Joon-Young and Jin, Hailin and Ghanem, Bernard},
title = {What do I Annotate Next? An Empirical Study of Active Learning for Action Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

An Adversarial Approach to Hard Triplet Generation
Zhao, Yiru and Jin, Zhongming and Qi, Guo-jun and Lu, Hongtao and Hua, Xian-sheng
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Yiru and Jin, Zhongming and Qi, Guo-jun and Lu, Hongtao and Hua, Xian-sheng},
title = {An Adversarial Approach to Hard Triplet Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Interactive Boundary Prediction for Object Selection
Le, Hoang and Mai, Long and Price, Brian and Cohen, Scott and Jin, Hailin and Liu, Feng
[pdf]
[bibtex]
@InProceedings{Le_2018_ECCV,
author = {Le, Hoang and Mai, Long and Price, Brian and Cohen, Scott and Jin, Hailin and Liu, Feng},
title = {Interactive Boundary Prediction for Object Selection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild
Muller, Matthias and Bibi, Adel and Giancola, Silvio and Alsubaihi, Salman and Ghanem, Bernard
[pdf]
[bibtex]
@InProceedings{Muller_2018_ECCV,
author = {Muller, Matthias and Bibi, Adel and Giancola, Silvio and Alsubaihi, Salman and Ghanem, Bernard},
title = {TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Concept Mask: Large-Scale Segmentation from Semantic Concepts
Wang, Yufei and Lin, Zhe and Shen, Xiaohui and Zhang, Jianming and Cohen, Scott
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Yufei and Lin, Zhe and Shen, Xiaohui and Zhang, Jianming and Cohen, Scott},
title = {Concept Mask: Large-Scale Segmentation from Semantic Concepts},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Simultaneous 3D Reconstruction for Water Surface and Underwater Scene
Qian, Yiming and Zheng, Yinqiang and Gong, Minglun and Yang, Yee-Hong
[pdf]
[bibtex]
@InProceedings{Qian_2018_ECCV,
author = {Qian, Yiming and Zheng, Yinqiang and Gong, Minglun and Yang, Yee-Hong},
title = {Simultaneous 3D Reconstruction for Water Surface and Underwater Scene},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SegStereo: Exploiting Semantic Information for Disparity Estimation
Yang, Guorun and Zhao, Hengshuang and Shi, Jianping and Deng, Zhidong and Jia, Jiaya
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Guorun and Zhao, Hengshuang and Shi, Jianping and Deng, Zhidong and Jia, Jiaya},
title = {SegStereo: Exploiting Semantic Information for Disparity Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D-CODED: 3D Correspondences by Deep Deformation
Groueix, Thibault and Fisher, Matthew and Kim, Vladimir G. and Russell, Bryan C. and Aubry, Mathieu
[pdf]
[bibtex]
@InProceedings{Groueix_2018_ECCV,
author = {Groueix, Thibault and Fisher, Matthew and Kim, Vladimir G. and Russell, Bryan C. and Aubry, Mathieu},
title = {3D-CODED: 3D Correspondences by Deep Deformation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry
Yang, Nan and Wang, Rui and Stuckler, Jorg and Cremers, Daniel
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Nan and Wang, Rui and Stuckler, Jorg and Cremers, Daniel},
title = {Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Single Image Intrinsic Decomposition without a Single Intrinsic Image
Ma, Wei-Chiu and Chu, Hang and Zhou, Bolei and Urtasun, Raquel and Torralba, Antonio
[pdf]
[bibtex]
@InProceedings{Ma_2018_ECCV,
author = {Ma, Wei-Chiu and Chu, Hang and Zhou, Bolei and Urtasun, Raquel and Torralba, Antonio},
title = {Single Image Intrinsic Decomposition without a Single Intrinsic Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Model-Based 6D Pose Refinement in RGB
Manhardt, Fabian and Kehl, Wadim and Navab, Nassir and Tombari, Federico
[pdf]
[bibtex]
@InProceedings{Manhardt_2018_ECCV,
author = {Manhardt, Fabian and Kehl, Wadim and Navab, Nassir and Tombari, Federico},
title = {Deep Model-Based 6D Pose Refinement in RGB},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning-based Video Motion Magnification
Oh, Tae-Hyun and Jaroensri, Ronnachai and Kim, Changil and Elgharib, Mohamed and Durand, Frédo and Freeman, William T. and Matusik, Wojciech
[pdf]
[bibtex]
@InProceedings{Oh_2018_ECCV,
author = {Oh, Tae-Hyun and Jaroensri, Ronnachai and Kim, Changil and Elgharib, Mohamed and Durand, Fr{\'e}do and Freeman, William T. and Matusik, Wojciech},
title = {Learning-based Video Motion Magnification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepJDOT: Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation
Damodaran, Bharath Bhushan and Kellenberger, Benjamin and Flamary, Remi and Tuia, Devis and Courty, Nicolas
[pdf]
[bibtex]
@InProceedings{Damodaran_2018_ECCV,
author = {Damodaran, Bharath Bhushan and Kellenberger, Benjamin and Flamary, Remi and Tuia, Devis and Courty, Nicolas},
title = {DeepJDOT: Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pose Proposal Networks
Sekii, Taiki
[pdf]
[bibtex]
@InProceedings{Sekii_2018_ECCV,
author = {Sekii, Taiki},
title = {Pose Proposal Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Regionlets for Object Detection
Xu, Hongyu and Lv, Xutao and Wang, Xiaoyu and Ren, Zhou and Bodla, Navaneeth and Chellappa, Rama
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Hongyu and Lv, Xutao and Wang, Xiaoyu and Ren, Zhou and Bodla, Navaneeth and Chellappa, Rama},
title = {Deep Regionlets for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning with Biased Complementary Labels
Yu, Xiyu and Liu, Tongliang and Gong, Mingming and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Xiyu and Liu, Tongliang and Gong, Mingming and Tao, Dacheng},
title = {Learning with Biased Complementary Labels},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
Lin, Tianwei and Zhao, Xu and Su, Haisheng and Wang, Chongjing and Yang, Ming
[pdf]
[bibtex]
@InProceedings{Lin_2018_ECCV,
author = {Lin, Tianwei and Zhao, Xu and Su, Haisheng and Wang, Chongjing and Yang, Ming},
title = {BSN: Boundary Sensitive Network for Temporal Action Proposal Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Reasoning with Multi-hop Feature Modulation
Strub, Florian and Seurin, Mathieu and Perez, Ethan and de Vries, Harm and Mary, Jeremie and Preux, Philippe and Courville, Aaron and Pietquin, Olivier
[pdf]
[bibtex]
@InProceedings{Strub_2018_ECCV,
author = {Strub, Florian and Seurin, Mathieu and Perez, Ethan and de Vries, Harm and Mary, Jeremie and Preux, Philippe and Courville, Aaron and Pietquin, Olivier},
title = {Visual Reasoning with Multi-hop Feature Modulation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multiresolution Tree Networks for 3D Point Cloud Processing
Gadelha, Matheus and Wang, Rui and Maji, Subhransu
[pdf]
[bibtex]
@InProceedings{Gadelha_2018_ECCV,
author = {Gadelha, Matheus and Wang, Rui and Maji, Subhransu},
title = {Multiresolution Tree Networks for 3D Point Cloud Processing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Seeing Tree Structure from Vibration
Xue, Tianfan and Wu, Jiajun and Zhang, Zhoutong and Zhang, Chengkai and Tenenbaum, Joshua B. and Freeman, William T.
[pdf]
[bibtex]
@InProceedings{Xue_2018_ECCV,
author = {Xue, Tianfan and Wu, Jiajun and Zhang, Zhoutong and Zhang, Chengkai and Tenenbaum, Joshua B. and Freeman, William T.},
title = {Seeing Tree Structure from Vibration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs
Yan, Shi and Wu, Chenglei and Wang, Lizhen and Xu, Feng and An, Liang and Guo, Kaiwen and Liu, Yebin
[pdf]
[bibtex]
@InProceedings{Yan_2018_ECCV,
author = {Yan, Shi and Wu, Chenglei and Wang, Lizhen and Xu, Feng and An, Liang and Guo, Kaiwen and Liu, Yebin},
title = {DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Probabilistic Video Generation using Holistic Attribute Control
He, Jiawei and Lehrmann, Andreas and Marino, Joseph and Mori, Greg and Sigal, Leonid
[pdf]
[bibtex]
@InProceedings{He_2018_ECCV,
author = {He, Jiawei and Lehrmann, Andreas and Marino, Joseph and Mori, Greg and Sigal, Leonid},
title = {Probabilistic Video Generation using Holistic Attribute Control},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Re-localization
Feng, Yang and Ma, Lin and Liu, Wei and Zhang, Tong and Luo, Jiebo
[pdf]
[bibtex]
@InProceedings{Feng_2018_ECCV,
author = {Feng, Yang and Ma, Lin and Liu, Wei and Zhang, Tong and Luo, Jiebo},
title = {Video Re-localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Adversarial Open-World Person Re-Identification
Li, Xiang and Wu, Ancong and Zheng, Wei-Shi
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Xiang and Wu, Ancong and Zheng, Wei-Shi},
title = {Adversarial Open-World Person Re-Identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Geometric Constrained Joint Lane Segmentation and Lane Boundary Detection
Zhang, Jie and Xu, Yi and Ni, Bingbing and Duan, Zhenyu
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Jie and Xu, Yi and Ni, Bingbing and Duan, Zhenyu},
title = {Geometric Constrained Joint Lane Segmentation and Lane Boundary Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Geometric Perspective on Structured Light Coding
Gupta, Mohit and Nakhate, Nikhil
[pdf]
[bibtex]
@InProceedings{Gupta_2018_ECCV,
author = {Gupta, Mohit and Nakhate, Nikhil},
title = {A Geometric Perspective on Structured Light Coding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Modular Generative Adversarial Networks
Zhao, Bo and Chang, Bo and Jie, Zequn and Sigal, Leonid
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Bo and Chang, Bo and Jie, Zequn and Sigal, Leonid},
title = {Modular Generative Adversarial Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SRFeat: Single Image Super-Resolution with Feature Discrimination
Park, Seong-Jin and Son, Hyeongseok and Cho, Sunghyun and Hong, Ki-Sang and Lee, Seungyong
[pdf]
[bibtex]
@InProceedings{Park_2018_ECCV,
author = {Park, Seong-Jin and Son, Hyeongseok and Cho, Sunghyun and Hong, Ki-Sang and Lee, Seungyong},
title = {SRFeat: Single Image Super-Resolution with Feature Discrimination},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning
Si, Chenyang and Jing, Ya and Wang, Wei and Wang, Liang and Tan, Tieniu
[pdf]
[bibtex]
@InProceedings{Si_2018_ECCV,
author = {Si, Chenyang and Jing, Ya and Wang, Wei and Wang, Liang and Tan, Tieniu},
title = {Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Self-produced Guidance for Weakly-supervised Object Localization
Zhang, Xiaolin and Wei, Yunchao and Kang, Guoliang and Yang, Yi and Huang, Thomas
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Xiaolin and Wei, Yunchao and Kang, Guoliang and Yang, Yi and Huang, Thomas},
title = {Self-produced Guidance for Weakly-supervised Object Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle
Martyushev, Evgeniy
[pdf]
[bibtex]
@InProceedings{Martyushev_2018_ECCV,
author = {Martyushev, Evgeniy},
title = {Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RIDI: Robust IMU Double Integration
Yan, Hang and Shan, Qi and Furukawa, Yasutaka
[pdf]
[bibtex]
@InProceedings{Yan_2018_ECCV,
author = {Yan, Hang and Shan, Qi and Furukawa, Yasutaka},
title = {RIDI: Robust IMU Double Integration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Monocular Depth by Distilling Cross-domain Stereo Networks
Guo, Xiaoyang and Li, Hongsheng and Yi, Shuai and Ren, Jimmy and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Xiaoyang and Li, Hongsheng and Yi, Shuai and Ren, Jimmy and Wang, Xiaogang},
title = {Learning Monocular Depth by Distilling Cross-domain Stereo Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fully Motion-Aware Network for Video Object Detection
Wang, Shiyao and Zhou, Yucong and Yan, Junjie and Deng, Zhidong
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Shiyao and Zhou, Yucong and Yan, Junjie and Deng, Zhidong},
title = {Fully Motion-Aware Network for Video Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

GridFace: Face Rectification via Learning Local Homography Transformations
Zhou, Erjin and Cao, Zhimin and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Erjin and Cao, Zhimin and Sun, Jian},
title = {GridFace: Face Rectification via Learning Local Homography Transformations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Feature Pyramid Reconfiguration for Object Detection
Kong, Tao and Sun, Fuchun and Tan, Chuanqi and Liu, Huaping and Huang, Wenbing
[pdf]
[bibtex]
@InProceedings{Kong_2018_ECCV,
author = {Kong, Tao and Sun, Fuchun and Tan, Chuanqi and Liu, Huaping and Huang, Wenbing},
title = {Deep Feature Pyramid Reconfiguration for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Does Haze Removal Help CNN-based Image Classification?
Pei, Yanting and Huang, Yaping and Zou, Qi and Lu, Yuhang and Wang, Song
[pdf]
[bibtex]
@InProceedings{Pei_2018_ECCV,
author = {Pei, Yanting and Huang, Yaping and Zou, Qi and Lu, Yuhang and Wang, Song},
title = {Does Haze Removal Help CNN-based Image Classification?},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-modal Cycle-consistent Generalized Zero-Shot Learning
Felix, Rafael and Kumar, Vijay B. G. and Reid, Ian and Carneiro, Gustavo
[pdf]
[bibtex]
@InProceedings{Felix_2018_ECCV,
author = {Felix, Rafael and Kumar, Vijay B. G. and Reid, Ian and Carneiro, Gustavo},
title = {Multi-modal Cycle-consistent Generalized Zero-Shot Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
Xu, Ning and Yang, Linjie and Fan, Yuchen and Yang, Jianchao and Yue, Dingcheng and Liang, Yuchen and Price, Brian and Cohen, Scott and Huang, Thomas
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Ning and Yang, Linjie and Fan, Yuchen and Yang, Jianchao and Yue, Dingcheng and Liang, Yuchen and Price, Brian and Cohen, Scott and Huang, Thomas},
title = {YouTube-VOS: Sequence-to-Sequence Video Object Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generalizing A Person Retrieval Model Hetero- and Homogeneously
Zhong, Zhun and Zheng, Liang and Li, Shaozi and Yang, Yi
[pdf]
[bibtex]
@InProceedings{Zhong_2018_ECCV,
author = {Zhong, Zhun and Zheng, Liang and Li, Shaozi and Yang, Yi},
title = {Generalizing A Person Retrieval Model Hetero- and Homogeneously},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DYAN: A Dynamical Atoms-Based Network For Video Prediction
Liu, Wenqian and Sharma, Abhishek and Camps, Octavia and Sznaier, Mario
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Wenqian and Sharma, Abhishek and Camps, Octavia and Sznaier, Mario},
title = {DYAN: A Dynamical Atoms-Based Network For Video Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation
Dai, Angela and Niessner, Matthias
[pdf]
[bibtex]
@InProceedings{Dai_2018_ECCV,
author = {Dai, Angela and Niessner, Matthias},
title = {3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

WildDash - Creating Hazard-Aware Benchmarks
Zendel, Oliver and Honauer, Katrin and Murschitz, Markus and Steininger, Daniel and Fernandez Dominguez, Gustavo
[pdf]
[bibtex]
@InProceedings{Zendel_2018_ECCV,
author = {Zendel, Oliver and Honauer, Katrin and Murschitz, Markus and Steininger, Daniel and Fernandez Dominguez, Gustavo},
title = {WildDash - Creating Hazard-Aware Benchmarks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Adaptively Transforming Graph Matching
Wang, Fudong and Xue, Nan and Zhang, Yipeng and Bai, Xiang and Xia, Gui-Song
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Fudong and Xue, Nan and Zhang, Yipeng and Bai, Xiang and Xia, Gui-Song},
title = {Adaptively Transforming Graph Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Look around Objects for Top-View Representations of Outdoor Scenes
Schulter, Samuel and Zhai, Menghua and Jacobs, Nathan and Chandraker, Manmohan
[pdf]
[bibtex]
@InProceedings{Schulter_2018_ECCV,
author = {Schulter, Samuel and Zhai, Menghua and Jacobs, Nathan and Chandraker, Manmohan},
title = {Learning to Look around Objects for Top-View Representations of Outdoor Scenes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Psychophysics for Making Face Recognition Algorithms More Explainable
RichardWebster, Brandon and Kwon, So Yon and Clarizio, Christopher and Anthony, Samuel E. and Scheirer, Walter J.
[pdf]
[bibtex]
@InProceedings{RichardWebster_2018_ECCV,
author = {RichardWebster, Brandon and Kwon, So Yon and Clarizio, Christopher and Anthony, Samuel E. and Scheirer, Walter J.},
title = {Visual Psychophysics for Making Face Recognition Algorithms More Explainable},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Eigendecomposition-free Training of Deep Networks with Zero Eigenvalue-based Losses
Dang, Zheng and Yi, Kwang Moo and Hu, Yinlin and Wang, Fei and Fua, Pascal and Salzmann, Mathieu
[pdf]
[bibtex]
@InProceedings{Dang_2018_ECCV,
author = {Dang, Zheng and Yi, Kwang Moo and Hu, Yinlin and Wang, Fei and Fua, Pascal and Salzmann, Mathieu},
title = {Eigendecomposition-free Training of Deep Networks with Zero Eigenvalue-based Losses},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Domain Generalization via Conditional Invariant Adversarial Networks
Li, Ya and Tian, Xinmei and Gong, Mingming and Liu, Yajing and Liu, Tongliang and Zhang, Kun and Tao, Dacheng
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Ya and Tian, Xinmei and Gong, Mingming and Liu, Yajing and Liu, Tongliang and Zhang, Kun and Tao, Dacheng},
title = {Deep Domain Generalization via Conditional Invariant Adversarial Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Local Spectral Graph Convolution for Point Set Feature Learning
Wang, Chu and Samari, Babak and Siddiqi, Kaleem
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Chu and Samari, Babak and Siddiqi, Kaleem},
title = {Local Spectral Graph Convolution for Point Set Feature Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Huh, Minyoung and Liu, Andrew and Owens, Andrew and Efros, Alexei A.
[pdf]
[bibtex]
@InProceedings{Huh_2018_ECCV,
author = {Huh, Minyoung and Liu, Andrew and Owens, Andrew and Efros, Alexei A.},
title = {Fighting Fake News: Image Splice Detection via Learned Self-Consistency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Receptive Field Block Net for Accurate and Fast Object Detection
Liu, Songtao and Huang, Di and Wang, Yunhong
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Songtao and Huang, Di and Wang, Yunhong},
title = {Receptive Field Block Net for Accurate and Fast Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking
Guo, Minghao and Lu, Jiwen and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Minghao and Lu, Jiwen and Zhou, Jie},
title = {Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Online Multi-Object Tracking with Dual Matching Attention Networks
Zhu, Ji and Yang, Hua and Liu, Nian and Kim, Minyoung and Zhang, Wenjun and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Ji and Yang, Hua and Liu, Nian and Kim, Minyoung and Zhang, Wenjun and Yang, Ming-Hsuan},
title = {Online Multi-Object Tracking with Dual Matching Attention Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Simultaneous Edge Alignment and Learning
Yu, Zhiding and Liu, Weiyang and Zou, Yang and Feng, Chen and Ramalingam, Srikumar and Vijaya Kumar, B. V. K. and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Zhiding and Liu, Weiyang and Zou, Yang and Feng, Chen and Ramalingam, Srikumar and Vijaya Kumar, B. V. K. and Kautz, Jan},
title = {Simultaneous Edge Alignment and Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior
Cai, Sijia and Zuo, Wangmeng and Davis, Larry S. and Zhang, Lei
[pdf]
[bibtex]
@InProceedings{Cai_2018_ECCV,
author = {Cai, Sijia and Zuo, Wangmeng and Davis, Larry S. and Zhang, Lei},
title = {Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Toward Scale-Invariance and Position-Sensitive Region Proposal Networks
Lu, Hsueh-Fu and Du, Xiaofei and Chang, Ping-Lin
[pdf]
[bibtex]
@InProceedings{Lu_2018_ECCV,
author = {Lu, Hsueh-Fu and Du, Xiaofei and Chang, Ping-Lin},
title = {Toward Scale-Invariance and Position-Sensitive Region Proposal Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Question Answering as a Meta Learning Task
Teney, Damien and van den Hengel, Anton
[pdf]
[bibtex]
@InProceedings{Teney_2018_ECCV,
author = {Teney, Damien and van den Hengel, Anton},
title = {Visual Question Answering as a Meta Learning Task},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generative Semantic Manipulation with Mask-Contrasting GAN
Liang, Xiaodan and Zhang, Hao and Lin, Liang and Xing, Eric
[pdf]
[bibtex]
@InProceedings{Liang_2018_ECCV,
author = {Liang, Xiaodan and Zhang, Hao and Lin, Liang and Xing, Eric},
title = {Generative Semantic Manipulation with Mask-Contrasting GAN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners
Hecker, Simon and Dai, Dengxin and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Hecker_2018_ECCV,
author = {Hecker, Simon and Dai, Dengxin and Van Gool, Luc},
title = {End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep High Dynamic Range Imaging with Large Foreground Motions
Wu, Shangzhe and Xu, Jiarui and Tai, Yu-Wing and Tang, Chi-Keung
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Shangzhe and Xu, Jiarui and Tai, Yu-Wing and Tang, Chi-Keung},
title = {Deep High Dynamic Range Imaging with Large Foreground Motions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hierarchical Relational Networks for Group Activity Recognition and Retrieval
Ibrahim, Mostafa S. and Mori, Greg
[pdf]
[bibtex]
@InProceedings{Ibrahim_2018_ECCV,
author = {Ibrahim, Mostafa S. and Mori, Greg},
title = {Hierarchical Relational Networks for Group Activity Recognition and Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints
Luo, Zixin and Shen, Tianwei and Zhou, Lei and Zhu, Siyu and Zhang, Runze and Yao, Yao and Fang, Tian and Quan, Long
[pdf]
[bibtex]
@InProceedings{Luo_2018_ECCV,
author = {Luo, Zixin and Shen, Tianwei and Zhou, Lei and Zhu, Siyu and Zhang, Runze and Yao, Yao and Fang, Tian and Quan, Long},
title = {GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SDC-Net: Video prediction using spatially-displaced convolution
Reda, Fitsum A. and Liu, Guilin and Shih, Kevin J. and Kirby, Robert and Barker, Jon and Tarjan, David and Tao, Andrew and Catanzaro, Bryan
[pdf]
[bibtex]
@InProceedings{Reda_2018_ECCV,
author = {Reda, Fitsum A. and Liu, Guilin and Shih, Kevin J. and Kirby, Robert and Barker, Jon and Tarjan, David and Tao, Andrew and Catanzaro, Bryan},
title = {SDC-Net: Video prediction using spatially-displaced convolution},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Sliding Window Computation for NN-Based Template Matching
Talker, Lior and Moses, Yael and Shimshoni, Ilan
[pdf]
[bibtex]
@InProceedings{Talker_2018_ECCV,
author = {Talker, Lior and Moses, Yael and Shimshoni, Ilan},
title = {Efficient Sliding Window Computation for NN-Based Template Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

RefocusGAN: Scene Refocusing using a Single Image
Sakurikar, Parikshit and Mehta, Ishit and Balasubramanian, Vineeth N. and Narayanan, P. J.
[pdf]
[bibtex]
@InProceedings{Sakurikar_2018_ECCV,
author = {Sakurikar, Parikshit and Mehta, Ishit and Balasubramanian, Vineeth N. and Narayanan, P. J.},
title = {RefocusGAN: Scene Refocusing using a Single Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization
Alwassel, Humam and Caba Heilbron, Fabian and Ghanem, Bernard
[pdf]
[bibtex]
@InProceedings{Alwassel_2018_ECCV,
author = {Alwassel, Humam and Caba Heilbron, Fabian and Ghanem, Bernard},
title = {Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Blind Motion Deblurring and Depth Estimation of Light Field
Lee, Dongwoo and Park, Haesol and Park, In Kyu and Lee, Kyoung Mu
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Dongwoo and Park, Haesol and Park, In Kyu and Lee, Kyoung Mu},
title = {Joint Blind Motion Deblurring and Depth Estimation of Light Field},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation
Nie, Xuecheng and Feng, Jiashi and Yan, Shuicheng
[pdf]
[bibtex]
@InProceedings{Nie_2018_ECCV,
author = {Nie, Xuecheng and Feng, Jiashi and Yan, Shuicheng},
title = {Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DOCK: Detecting Objects by transferring Common-sense Knowledge
Singh, Krishna Kumar and Divvala, Santosh and Farhadi, Ali and Lee, Yong Jae
[pdf]
[bibtex]
@InProceedings{Singh_2018_ECCV,
author = {Singh, Krishna Kumar and Divvala, Santosh and Farhadi, Ali and Lee, Yong Jae},
title = {DOCK: Detecting Objects by transferring Common-sense Knowledge},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Simple Baselines for Human Pose Estimation and Tracking
Xiao, Bin and Wu, Haiping and Wei, Yichen
[pdf]
[bibtex]
@InProceedings{Xiao_2018_ECCV,
author = {Xiao, Bin and Wu, Haiping and Wei, Yichen},
title = {Simple Baselines for Human Pose Estimation and Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Wang, Lan and Gao, Chenqiang and Yang, Luyu and Zhao, Yue and Zuo, Wangmeng and Meng, Deyu
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Lan and Gao, Chenqiang and Yang, Luyu and Zhao, Yue and Zuo, Wangmeng and Meng, Deyu},
title = {PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CAR-Net: Clairvoyant Attentive Recurrent Network
Sadeghian, Amir and Legros, Ferdinand and Voisin, Maxime and Vesel, Ricky and Alahi, Alexandre and Savarese, Silvio
[pdf]
[bibtex]
@InProceedings{Sadeghian_2018_ECCV,
author = {Sadeghian, Amir and Legros, Ferdinand and Voisin, Maxime and Vesel, Ricky and Alahi, Alexandre and Savarese, Silvio},
title = {CAR-Net: Clairvoyant Attentive Recurrent Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dynamic Filtering with Large Sampling Field for ConvNets
Wu, Jialin and Li, Dai and Yang, Yu and Bajaj, Chandrajit and Ji, Xiangyang
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Jialin and Li, Dai and Yang, Yu and Bajaj, Chandrajit and Ji, Xiangyang},
title = {Dynamic Filtering with Large Sampling Field for ConvNets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Category-Specific Mesh Reconstruction from Image Collections
Kanazawa, Angjoo and Tulsiani, Shubham and Efros, Alexei A. and Malik, Jitendra
[pdf]
[bibtex]
@InProceedings{Kanazawa_2018_ECCV,
author = {Kanazawa, Angjoo and Tulsiani, Shubham and Efros, Alexei A. and Malik, Jitendra},
title = {Learning Category-Specific Mesh Reconstruction from Image Collections},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Clustering Convolutional Kernels to Compress Deep Neural Networks
Son, Sanghyun and Nah, Seungjun and Lee, Kyoung Mu
[pdf]
[bibtex]
@InProceedings{Son_2018_ECCV,
author = {Son, Sanghyun and Nah, Seungjun and Lee, Kyoung Mu},
title = {Clustering Convolutional Kernels to Compress Deep Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CornerNet: Detecting Objects as Paired Keypoints
Law, Hei and Deng, Jia
[pdf]
[bibtex]
@InProceedings{Law_2018_ECCV,
author = {Law, Hei and Deng, Jia},
title = {CornerNet: Detecting Objects as Paired Keypoints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Dense Point Cloud Object Reconstruction using Deformation Vector Fields
Li, Kejie and Pham, Trung and Zhan, Huangying and Reid, Ian
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Kejie and Pham, Trung and Zhan, Huangying and Reid, Ian},
title = {Efficient Dense Point Cloud Object Reconstruction using Deformation Vector Fields},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance
Selvaraju, Ramprasaath R. and Chattopadhyay, Prithvijit and Elhoseiny, Mohamed and Sharma, Tilak and Batra, Dhruv and Parikh, Devi and Lee, Stefan
[pdf]
[bibtex]
@InProceedings{Selvaraju_2018_ECCV,
author = {Selvaraju, Ramprasaath R. and Chattopadhyay, Prithvijit and Elhoseiny, Mohamed and Sharma, Tilak and Batra, Dhruv and Parikh, Devi and Lee, Stefan},
title = {Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hashing with Binary Matrix Pursuit
Cakir, Fatih and He, Kun and Sclaroff, Stan
[pdf]
[bibtex]
@InProceedings{Cakir_2018_ECCV,
author = {Cakir, Fatih and He, Kun and Sclaroff, Stan},
title = {Hashing with Binary Matrix Pursuit},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recognition in Terra Incognita
Beery, Sara and Van Horn, Grant and Perona, Pietro
[pdf]
[bibtex]
@InProceedings{Beery_2018_ECCV,
author = {Beery, Sara and Van Horn, Grant and Perona, Pietro},
title = {Recognition in Terra Incognita},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fast and Accurate Intrinsic Symmetry Detection
Nagar, Rajendra and Raman, Shanmuganathan
[pdf]
[bibtex]
@InProceedings{Nagar_2018_ECCV,
author = {Nagar, Rajendra and Raman, Shanmuganathan},
title = {Fast and Accurate Intrinsic Symmetry Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Massively Parallel Video Networks
Carreira, Joao and Patraucean, Viorica and Mazare, Laurent and Zisserman, Andrew and Osindero, Simon
[pdf]
[bibtex]
@InProceedings{Carreira_2018_ECCV,
author = {Carreira, Joao and Patraucean, Viorica and Mazare, Laurent and Zisserman, Andrew and Osindero, Simon},
title = {Massively Parallel Video Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ExFuse: Enhancing Feature Fusion for Semantic Segmentation
Zhang, Zhenli and Zhang, Xiangyu and Peng, Chao and Xue, Xiangyang and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Zhenli and Zhang, Xiangyu and Peng, Chao and Xue, Xiangyang and Sun, Jian},
title = {ExFuse: Enhancing Feature Fusion for Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Collaborative Deep Reinforcement Learning for Multi-Object Tracking
Ren, Liangliang and Lu, Jiwen and Wang, Zifeng and Tian, Qi and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Ren_2018_ECCV,
author = {Ren, Liangliang and Lu, Jiwen and Wang, Zifeng and Tian, Qi and Zhou, Jie},
title = {Collaborative Deep Reinforcement Learning for Multi-Object Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Variational Metric Learning
Lin, Xudong and Duan, Yueqi and Dong, Qiyuan and Lu, Jiwen and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Lin_2018_ECCV,
author = {Lin, Xudong and Duan, Yueqi and Dong, Qiyuan and Lu, Jiwen and Zhou, Jie},
title = {Deep Variational Metric Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MVTec D2S: Densely Segmented Supermarket Dataset
Follmann, Patrick and Bottger, Tobias and Hartinger, Philipp and Konig, Rebecca and Ulrich, Markus
[pdf]
[bibtex]
@InProceedings{Follmann_2018_ECCV,
author = {Follmann, Patrick and Bottger, Tobias and Hartinger, Philipp and Konig, Rebecca and Ulrich, Markus},
title = {MVTec D2S: Densely Segmented Supermarket Dataset},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Robust fitting in computer vision: easy or hard?
Chin, Tat-Jun and Cai, Zhipeng and Neumann, Frank
[pdf]
[bibtex]
@InProceedings{Chin_2018_ECCV,
author = {Chin, Tat-Jun and Cai, Zhipeng and Neumann, Frank},
title = {Robust fitting in computer vision: easy or hard?},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Question Generation for Class Acquisition of Unknown Objects
Uehara, Kohei and Tejero-De-Pablos, Antonio and Ushiku, Yoshitaka and Harada, Tatsuya
[pdf]
[bibtex]
@InProceedings{Uehara_2018_ECCV,
author = {Uehara, Kohei and Tejero-De-Pablos, Antonio and Ushiku, Yoshitaka and Harada, Tatsuya},
title = {Visual Question Generation for Class Acquisition of Unknown Objects},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Image Manipulation with Perceptual Discriminators
Sungatullina, Diana and Zakharov, Egor and Ulyanov, Dmitry and Lempitsky, Victor
[pdf]
[bibtex]
@InProceedings{Sungatullina_2018_ECCV,
author = {Sungatullina, Diana and Zakharov, Egor and Ulyanov, Dmitry and Lempitsky, Victor},
title = {Image Manipulation with Perceptual Discriminators},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pairwise Confusion for Fine-Grained Visual Classification
Dubey, Abhimanyu and Gupta, Otkrist and Guo, Pei and Raskar, Ramesh and Farrell, Ryan and Naik, Nikhil
[pdf]
[bibtex]
@InProceedings{Dubey_2018_ECCV,
author = {Dubey, Abhimanyu and Gupta, Otkrist and Guo, Pei and Raskar, Ramesh and Farrell, Ryan and Naik, Nikhil},
title = {Pairwise Confusion for Fine-Grained Visual Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Combining 3D Model Contour Energy and Keypoints for Object Tracking
Bugaev, Bogdan and Kryshchenko, Anton and Belov, Roman
[pdf]
[bibtex]
@InProceedings{Bugaev_2018_ECCV,
author = {Bugaev, Bogdan and Kryshchenko, Anton and Belov, Roman},
title = {Combining 3D Model Contour Energy and Keypoints for Object Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Quadtree Convolutional Neural Networks
Kumar Jayaraman, Pradeep and Mei, Jianhan and Cai, Jianfei and Zheng, Jianmin
[pdf]
[bibtex]
@InProceedings{Jayaraman_2018_ECCV,
author = {Kumar Jayaraman, Pradeep and Mei, Jianhan and Cai, Jianfei and Zheng, Jianmin},
title = {Quadtree Convolutional Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Recursive HDRI: Inverse Tone Mapping using Generative Adversarial Networks
Lee, Siyeong and Hwan An, Gwon and Kang, Suk-Ju
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Siyeong and Hwan An, Gwon and Kang, Suk-Ju},
title = {Deep Recursive HDRI: Inverse Tone Mapping using Generative Adversarial Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Open Set Learning with Counterfactual Images
Neal, Lawrence and Olson, Matthew and Fern, Xiaoli and Wong, Weng-Keen and Li, Fuxin
[pdf]
[bibtex]
@InProceedings{Neal_2018_ECCV,
author = {Neal, Lawrence and Olson, Matthew and Fern, Xiaoli and Wong, Weng-Keen and Li, Fuxin},
title = {Open Set Learning with Counterfactual Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
Sundermeyer, Martin and Marton, Zoltan-Csaba and Durner, Maximilian and Brucker, Manuel and Triebel, Rudolph
[pdf]
[bibtex]
@InProceedings{Sundermeyer_2018_ECCV,
author = {Sundermeyer, Martin and Marton, Zoltan-Csaba and Durner, Maximilian and Brucker, Manuel and Triebel, Rudolph},
title = {Implicit 3D Orientation Learning for 6D Object Detection from RGB Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Compressing the Input for CNNs with the First-Order Scattering Transform
Oyallon, Edouard and Belilovsky, Eugene and Zagoruyko, Sergey and Valko, Michal
[pdf]
[bibtex]
@InProceedings{Oyallon_2018_ECCV,
author = {Oyallon, Edouard and Belilovsky, Eugene and Zagoruyko, Sergey and Valko, Michal},
title = {Compressing the Input for CNNs with the First-Order Scattering Transform},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Part-Aligned Bilinear Representations for Person Re-Identification
Suh, Yumin and Wang, Jingdong and Tang, Siyu and Mei, Tao and Mu Lee, Kyoung
[pdf]
[bibtex]
@InProceedings{Suh_2018_ECCV,
author = {Suh, Yumin and Wang, Jingdong and Tang, Siyu and Mei, Tao and Mu Lee, Kyoung},
title = {Part-Aligned Bilinear Representations for Person Re-Identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Sidekick Policy Learning for Active Visual Exploration
Ramakrishnan, Santhosh K. and Grauman, Kristen
[pdf]
[bibtex]
@InProceedings{Ramakrishnan_2018_ECCV,
author = {Ramakrishnan, Santhosh K. and Grauman, Kristen},
title = {Sidekick Policy Learning for Active Visual Exploration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HGMR: Hierarchical Gaussian Mixtures for Adaptive 3D Registration
Eckart, B. and Kim, K. and Kautz, J.
[pdf]
[bibtex]
@InProceedings{Eckart_2018_ECCV,
author = {Eckart, B. and Kim, K. and Kautz, J.},
title = {HGMR: Hierarchical Gaussian Mixtures for Adaptive 3D Registration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition
Sun, Ming and Yuan, Yuchen and Zhou, Feng and Ding, Errui
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Ming and Yuan, Yuchen and Zhou, Feng and Ding, Errui},
title = {Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

From Face Recognition to Models of Identity: A Bayesian Approach to Learning about Unknown Identities from Unsupervised Data
Coelho de Castro, Daniel and Nowozin, Sebastian
[pdf]
[bibtex]
@InProceedings{Castro_2018_ECCV,
author = {Coelho de Castro, Daniel and Nowozin, Sebastian},
title = {From Face Recognition to Models of Identity: A Bayesian Approach to Learning about Unknown Identities from Unsupervised Data},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semi-convolutional Operators for Instance Segmentation
Novotny, David and Albanie, Samuel and Larlus, Diane and Vedaldi, Andrea
[pdf]
[bibtex]
@InProceedings{Novotny_2018_ECCV,
author = {Novotny, David and Albanie, Samuel and Larlus, Diane and Vedaldi, Andrea},
title = {Semi-convolutional Operators for Instance Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Bi-box Regression for Pedestrian Detection and Occlusion Estimation
Zhou, Chunluan and Yuan, Junsong
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Chunluan and Yuan, Junsong},
title = {Bi-box Regression for Pedestrian Detection and Occlusion Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Data Terms for Non-blind Deblurring
Dong, Jiangxin and Pan, Jinshan and Sun, Deqing and Su, Zhixun and Yang, Ming-Hsuan
[pdf]
[bibtex]
@InProceedings{Dong_2018_ECCV,
author = {Dong, Jiangxin and Pan, Jinshan and Sun, Deqing and Su, Zhixun and Yang, Ming-Hsuan},
title = {Learning Data Terms for Non-blind Deblurring},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unified Perceptual Parsing for Scene Understanding
Xiao, Tete and Liu, Yingcheng and Zhou, Bolei and Jiang, Yuning and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Xiao_2018_ECCV,
author = {Xiao, Tete and Liu, Yingcheng and Zhou, Bolei and Jiang, Yuning and Sun, Jian},
title = {Unified Perceptual Parsing for Scene Understanding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Face Super-resolution Guided by Facial Component Heatmaps
Yu, Xin and Fernando, Basura and Ghanem, Bernard and Porikli, Fatih and Hartley, Richard
[pdf]
[bibtex]
@InProceedings{Yu_2018_ECCV,
author = {Yu, Xin and Fernando, Basura and Ghanem, Bernard and Porikli, Fatih and Hartley, Richard},
title = {Face Super-resolution Guided by Facial Component Heatmaps},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Descending, lifting or smoothing: Secrets of robust cost optimization
Zach, Christopher and Bourmaud, Guillaume
[pdf]
[bibtex]
@InProceedings{Zach_2018_ECCV,
author = {Zach, Christopher and Bourmaud, Guillaume},
title = {Descending, lifting or smoothing: Secrets of robust cost optimization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ExplainGAN: Model Explanation via Decision Boundary Crossing Transformations
Samangouei, Pouya and Saeedi, Ardavan and Nakagawa, Liam and Silberman, Nathan
[pdf]
[bibtex]
@InProceedings{Samangouei_2018_ECCV,
author = {Samangouei, Pouya and Saeedi, Ardavan and Nakagawa, Liam and Silberman, Nathan},
title = {ExplainGAN: Model Explanation via Decision Boundary Crossing Transformations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Unified Framework for Multi-View Multi-Class Object Pose Estimation
Li, Chi and Bai, Jin and Hager, Gregory D.
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Chi and Bai, Jin and Hager, Gregory D.},
title = {A Unified Framework for Multi-View Multi-Class Object Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Spatio-Temporal Channel Correlation Networks for Action Classification
Diba, Ali and Fayyaz, Mohsen and Sharma, Vivek and Mahdi Arzani, M. and Yousefzadeh, Rahman and Gall, Juergen and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Diba_2018_ECCV,
author = {Diba, Ali and Fayyaz, Mohsen and Sharma, Vivek and Mahdi Arzani, M. and Yousefzadeh, Rahman and Gall, Juergen and Van Gool, Luc},
title = {Spatio-Temporal Channel Correlation Networks for Action Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Reconstruct High-quality 3D Shapes with Cascaded Fully Convolutional Networks
Cao, Yan-Pei and Liu, Zheng-Ning and Kuang, Zheng-Fei and Kobbelt, Leif and Hu, Shi-Min
[pdf]
[bibtex]
@InProceedings{Cao_2018_ECCV,
author = {Cao, Yan-Pei and Liu, Zheng-Ning and Kuang, Zheng-Fei and Kobbelt, Leif and Hu, Shi-Min},
title = {Learning to Reconstruct High-quality 3D Shapes with Cascaded Fully Convolutional Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation
Xiao, Chaowei and Deng, Ruizhi and Li, Bo and Yu, Fisher and Liu, Mingyan and Song, Dawn
[pdf]
[bibtex]
@InProceedings{Xiao_2018_ECCV,
author = {Xiao, Chaowei and Deng, Ruizhi and Li, Bo and Yu, Fisher and Liu, Mingyan and Song, Dawn},
title = {Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Bilinear Learning for RGB-D Action Recognition
Hu, Jian-Fang and Zheng, Wei-Shi and Pan, Jiahui and Lai, Jianhuang and Zhang, Jianguo
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Jian-Fang and Zheng, Wei-Shi and Pan, Jiahui and Lai, Jianhuang and Zhang, Jianguo},
title = {Deep Bilinear Learning for RGB-D Action Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Coded Two-Bucket Cameras for Computer Vision
Wei, Mian and Sarhangnejad, Navid and Xia, Zhengfan and Gusev, Nikita and Katic, Nikola and Genov, Roman and Kutulakos, Kiriakos N.
[pdf]
[bibtex]
@InProceedings{Wei_2018_ECCV,
author = {Wei, Mian and Sarhangnejad, Navid and Xia, Zhengfan and Gusev, Nikita and Katic, Nikola and Genov, Roman and Kutulakos, Kiriakos N.},
title = {Coded Two-Bucket Cameras for Computer Vision},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Few-Shot Human Motion Prediction via Meta-Learning
Gui, Liang-Yan and Wang, Yu-Xiong and Ramanan, Deva and Moura, Jose M. F.
[pdf]
[bibtex]
@InProceedings{Gui_2018_ECCV,
author = {Gui, Liang-Yan and Wang, Yu-Xiong and Ramanan, Deva and Moura, Jose M. F.},
title = {Few-Shot Human Motion Prediction via Meta-Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Recycle-GAN: Unsupervised Video Retargeting
Bansal, Aayush and Ma, Shugao and Ramanan, Deva and Sheikh, Yaser
[pdf]
[bibtex]
@InProceedings{Bansal_2018_ECCV,
author = {Bansal, Aayush and Ma, Shugao and Ramanan, Deva and Sheikh, Yaser},
title = {Recycle-GAN: Unsupervised Video Retargeting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net
Pan, Xingang and Luo, Ping and Shi, Jianping and Tang, Xiaoou
[pdf]
[bibtex]
@InProceedings{Pan_2018_ECCV,
author = {Pan, Xingang and Luo, Ping and Shi, Jianping and Tang, Xiaoou},
title = {Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Shape Priors for Single-View 3D Completion and Reconstruction
Wu, Jiajun and Zhang, Chengkai and Zhang, Xiuming and Zhang, Zhoutong and Freeman, William T. and Tenenbaum, Joshua B.
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Jiajun and Zhang, Chengkai and Zhang, Xiuming and Zhang, Zhoutong and Freeman, William T. and Tenenbaum, Joshua B.},
title = {Learning Shape Priors for Single-View 3D Completion and Reconstruction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks
Li, Minjun and Huang, Haozhi and Ma, Lin and Liu, Wei and Zhang, Tong and Jiang, Yugang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Minjun and Huang, Haozhi and Ma, Lin and Liu, Wei and Zhang, Tong and Jiang, Yugang},
title = {Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences
Fathy, Mohammed E. and Tran, Quoc-Huy and Zeeshan Zia, M. and Vernaza, Paul and Chandraker, Manmohan
[pdf]
[bibtex]
@InProceedings{Fathy_2018_ECCV,
author = {Fathy, Mohammed E. and Tran, Quoc-Huy and Zeeshan Zia, M. and Vernaza, Paul and Chandraker, Manmohan},
title = {Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Minimal Closed-Form Solution for Multi-Perspective Pose Estimation using Points and Lines
Miraldo, Pedro and Dias, Tiago and Ramalingam, Srikumar
[pdf]
[bibtex]
@InProceedings{Miraldo_2018_ECCV,
author = {Miraldo, Pedro and Dias, Tiago and Ramalingam, Srikumar},
title = {A Minimal Closed-Form Solution for Multi-Perspective Pose Estimation using Points and Lines},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Key-Word-Aware Network for Referring Expression Image Segmentation
Shi, Hengcan and Li, Hongliang and Meng, Fanman and Wu, Qingbo
[pdf]
[bibtex]
@InProceedings{Shi_2018_ECCV,
author = {Shi, Hengcan and Li, Hongliang and Meng, Fanman and Wu, Qingbo},
title = {Key-Word-Aware Network for Referring Expression Image Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dynamic Conditional Networks for Few-Shot Learning
Zhao, Fang and Zhao, Jian and Yan, Shuicheng and Feng, Jiashi
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Fang and Zhao, Jian and Yan, Shuicheng and Feng, Jiashi},
title = {Dynamic Conditional Networks for Few-Shot Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks
Aittala, Miika and Durand, Fredo
[pdf]
[bibtex]
@InProceedings{Aittala_2018_ECCV,
author = {Aittala, Miika and Durand, Fredo},
title = {Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Type-Aware Embeddings for Fashion Compatibility
Vasileva, Mariya I. and Plummer, Bryan A. and Dusad, Krishna and Rajpal, Shreya and Kumar, Ranjitha and Forsyth, David
[pdf]
[bibtex]
@InProceedings{Vasileva_2018_ECCV,
author = {Vasileva, Mariya I. and Plummer, Bryan A. and Dusad, Krishna and Rajpal, Shreya and Kumar, Ranjitha and Forsyth, David},
title = {Learning Type-Aware Embeddings for Fashion Compatibility},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Fuse Proposals from Multiple Scanline Optimizations in Semi-Global Matching
Schonberger, Johannes L. and Sinha, Sudipta N. and Pollefeys, Marc
[pdf]
[bibtex]
@InProceedings{Schonberger_2018_ECCV,
author = {Schonberger, Johannes L. and Sinha, Sudipta N. and Pollefeys, Marc},
title = {Learning to Fuse Proposals from Multiple Scanline Optimizations in Semi-Global Matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Dividing and Aggregating Network for Multi-view Action Recognition
Wang, Dongang and Ouyang, Wanli and Li, Wen and Xu, Dong
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Dongang and Ouyang, Wanli and Li, Wen and Xu, Dong},
title = {Dividing and Aggregating Network for Multi-view Action Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classification
Hong, Danfeng and Yokoya, Naoto and Xu, Jian and Zhu, Xiaoxiang
[pdf]
[bibtex]
@InProceedings{Hong_2018_ECCV,
author = {Hong, Danfeng and Yokoya, Naoto and Xu, Jian and Zhu, Xiaoxiang},
title = {Joint \& Progressive Learning from High-Dimensional Data for Multi-Label Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Image Inpainting for Irregular Holes Using Partial Convolutions
Liu, Guilin and Reda, Fitsum A. and Shih, Kevin J. and Wang, Ting-Chun and Tao, Andrew and Catanzaro, Bryan
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Guilin and Reda, Fitsum A. and Shih, Kevin J. and Wang, Ting-Chun and Tao, Andrew and Catanzaro, Bryan},
title = {Image Inpainting for Irregular Holes Using Partial Convolutions},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps
Hongsuck Seo, Paul and Weyand, Tobias and Sim, Jack and Han, Bohyung
[pdf]
[bibtex]
@InProceedings{Seo_2018_ECCV,
author = {Hongsuck Seo, Paul and Weyand, Tobias and Sim, Jack and Han, Bohyung},
title = {CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Chen, Liang-Chieh and Zhu, Yukun and Papandreou, George and Schroff, Florian and Adam, Hartwig
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Liang-Chieh and Zhu, Yukun and Papandreou, George and Schroff, Florian and Adam, Hartwig},
title = {Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Large Scale Urban Scene Modeling from MVS Meshes
Zhu, Lingjie and Shen, Shuhan and Gao, Xiang and Hu, Zhanyi
[pdf]
[bibtex]
@InProceedings{Zhu_2018_ECCV,
author = {Zhu, Lingjie and Shen, Shuhan and Gao, Xiang and Hu, Zhanyi},
title = {Large Scale Urban Scene Modeling from MVS Meshes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Generalized Loss-Sensitive Adversarial Learning with Manifold Margins
Edraki, Marzieh and Qi, Guo-Jun
[pdf]
[bibtex]
@InProceedings{Edraki_2018_ECCV,
author = {Edraki, Marzieh and Qi, Guo-Jun},
title = {Generalized Loss-Sensitive Adversarial Learning with Manifold Margins},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World
Fabbri, Matteo and Lanzi, Fabio and Calderara, Simone and Palazzi, Andrea and Vezzani, Roberto and Cucchiara, Rita
[pdf]
[bibtex]
@InProceedings{Fabbri_2018_ECCV,
author = {Fabbri, Matteo and Lanzi, Fabio and Calderara, Simone and Palazzi, Andrea and Vezzani, Roberto and Cucchiara, Rita},
title = {Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

W-TALC: Weakly-supervised Temporal Activity Localization and Classification
Paul, Sujoy and Roy, Sourya and Roy-Chowdhury, Amit K.
[pdf]
[bibtex]
@InProceedings{Paul_2018_ECCV,
author = {Paul, Sujoy and Roy, Sourya and Roy-Chowdhury, Amit K.},
title = {W-TALC: Weakly-supervised Temporal Activity Localization and Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Viewpoint Estimation---Insights & Model
Divon, Gilad and Tal, Ayellet
[pdf]
[bibtex]
@InProceedings{Divon_2018_ECCV,
author = {Divon, Gilad and Tal, Ayellet},
title = {Viewpoint Estimation---Insights \& Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Relaxation-Free Deep Hashing via Policy Gradient
Yuan, Xin and Ren, Liangliang and Lu, Jiwen and Zhou, Jie
[pdf]
[bibtex]
@InProceedings{Yuan_2018_ECCV,
author = {Yuan, Xin and Ren, Liangliang and Lu, Jiwen and Zhou, Jie},
title = {Relaxation-Free Deep Hashing via Policy Gradient},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Rolling Shutter Pose and Ego-motion Estimation using Shape-from-Template
Lao, Yizhen and Ait-Aider, Omar and Bartoli, Adrien
[pdf]
[bibtex]
@InProceedings{Lao_2018_ECCV,
author = {Lao, Yizhen and Ait-Aider, Omar and Bartoli, Adrien},
title = {Rolling Shutter Pose and Ego-motion Estimation using Shape-from-Template},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Capture Light Fields through a Coded Aperture Camera
Inagaki, Yasutaka and Kobayashi, Yuto and Takahashi, Keita and Fujii, Toshiaki and Nagahara, Hajime
[pdf]
[bibtex]
@InProceedings{Inagaki_2018_ECCV,
author = {Inagaki, Yasutaka and Kobayashi, Yuto and Takahashi, Keita and Fujii, Toshiaki and Nagahara, Hajime},
title = {Learning to Capture Light Fields through a Coded Aperture Camera},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Variable Ring Light Imaging: Capturing Transient Subsurface Scattering with An Ordinary Camera
Nishino, Ko and Subpa-asa, Art and Asano, Yuta and Shimano, Mihoko and Sato, Imari
[pdf]
[bibtex]
@InProceedings{Nishino_2018_ECCV,
author = {Nishino, Ko and Subpa-asa, Art and Asano, Yuta and Shimano, Mihoko and Sato, Imari},
title = {Variable Ring Light Imaging: Capturing Transient Subsurface Scattering with An Ordinary Camera},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Video Generation, Prediction and Completion of Human Action Sequences
Cai, Haoye and Bai, Chunyan and Tai, Yu-Wing and Tang, Chi-Keung
[pdf]
[bibtex]
@InProceedings{Cai_2018_ECCV,
author = {Cai, Haoye and Bai, Chunyan and Tai, Yu-Wing and Tang, Chi-Keung},
title = {Deep Video Generation, Prediction and Completion of Human Action Sequences},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model
Papandreou, George and Zhu, Tyler and Chen, Liang-Chieh and Gidaris, Spyros and Tompson, Jonathan and Murphy, Kevin
[pdf]
[bibtex]
@InProceedings{Papandreou_2018_ECCV,
author = {Papandreou, George and Zhu, Tyler and Chen, Liang-Chieh and Gidaris, Spyros and Tompson, Jonathan and Murphy, Kevin},
title = {PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Robust image stitching with multiple registrations
Herrmann, Charles and Wang, Chen and Strong Bowen, Richard and Keyder, Emil and Krainin, Michael and Liu, Ce and Zabih, Ramin
[pdf]
[bibtex]
@InProceedings{Herrmann_2018_ECCV,
author = {Herrmann, Charles and Wang, Chen and Strong Bowen, Richard and Keyder, Emil and Krainin, Michael and Liu, Ce and Zabih, Ramin},
title = {Robust image stitching with multiple registrations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Solve Nonlinear Least Squares for Monocular Stereo
Clark, Ronald and Bloesch, Michael and Czarnowski, Jan and Leutenegger, Stefan and Davison, Andrew J.
[pdf]
[bibtex]
@InProceedings{Clark_2018_ECCV,
author = {Clark, Ronald and Bloesch, Michael and Czarnowski, Jan and Leutenegger, Stefan and Davison, Andrew J.},
title = {Learning to Solve Nonlinear Least Squares for Monocular Stereo},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Direct Sparse Odometry With Rolling Shutter
Schubert, David and Demmel, Nikolaus and Usenko, Vladyslav and Stuckler, Jorg and Cremers, Daniel
[pdf]
[bibtex]
@InProceedings{Schubert_2018_ECCV,
author = {Schubert, David and Demmel, Nikolaus and Usenko, Vladyslav and Stuckler, Jorg and Cremers, Daniel},
title = {Direct Sparse Odometry With Rolling Shutter},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Zero-Shot Framework for Sketch based Image Retrieval
Kiran Yelamarthi, Sasi and Krishna Reddy, Shiva and Mishra, Ashish and Mittal, Anurag
[pdf]
[bibtex]
@InProceedings{Yelamarthi_2018_ECCV,
author = {Kiran Yelamarthi, Sasi and Krishna Reddy, Shiva and Mishra, Ashish and Mittal, Anurag},
title = {A Zero-Shot Framework for Sketch based Image Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Structured Siamese Network for Real-Time Visual Tracking
Zhang, Yunhua and Wang, Lijun and Qi, Jinqing and Wang, Dong and Feng, Mengyang and Lu, Huchuan
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Yunhua and Wang, Lijun and Qi, Jinqing and Wang, Dong and Feng, Mengyang and Lu, Huchuan},
title = {Structured Siamese Network for Real-Time Visual Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Selective Zero-Shot Classification with Augmented Attributes
Song, Jie and Shen, Chengchao and Lei, Jie and Zeng, An-Xiang and Ou, Kairi and Tao, Dacheng and Song, Mingli
[pdf]
[bibtex]
@InProceedings{Song_2018_ECCV,
author = {Song, Jie and Shen, Chengchao and Lei, Jie and Zeng, An-Xiang and Ou, Kairi and Tao, Dacheng and Song, Mingli},
title = {Selective Zero-Shot Classification with Augmented Attributes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Attention Neural Tensor Network for Visual Question Answering
Bai, Yalong and Fu, Jianlong and Zhao, Tiejun and Mei, Tao
[pdf]
[bibtex]
@InProceedings{Bai_2018_ECCV,
author = {Bai, Yalong and Fu, Jianlong and Zhao, Tiejun and Mei, Tao},
title = {Deep Attention Neural Tensor Network for Visual Question Answering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Zero-Shot Object Detection
Bansal, Ankan and Sikka, Karan and Sharma, Gaurav and Chellappa, Rama and Divakaran, Ajay
[pdf]
[bibtex]
@InProceedings{Bansal_2018_ECCV,
author = {Bansal, Ankan and Sikka, Karan and Sharma, Gaurav and Chellappa, Rama and Divakaran, Ajay},
title = {Zero-Shot Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Asynchronous, Photometric Feature Tracking using Events and Frames
Gehrig, Daniel and Rebecq, Henri and Gallego, Guillermo and Scaramuzza, Davide
[pdf]
[bibtex]
@InProceedings{Gehrig_2018_ECCV,
author = {Gehrig, Daniel and Rebecq, Henri and Gallego, Guillermo and Scaramuzza, Davide},
title = {Asynchronous, Photometric Feature Tracking using Events and Frames},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Class-Specific Deblurring
Madam Nimisha, Thekke and Sunil, Kumar and Rajagopalan, A. N.
[pdf]
[bibtex]
@InProceedings{Nimisha_2018_ECCV,
author = {Madam Nimisha, Thekke and Sunil, Kumar and Rajagopalan, A. N.},
title = {Unsupervised Class-Specific Deblurring},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Imagine This! Scripts to Compositions to Videos
Gupta, Tanmay and Schwenk, Dustin and Farhadi, Ali and Hoiem, Derek and Kembhavi, Aniruddha
[pdf]
[bibtex]
@InProceedings{Gupta_2018_ECCV,
author = {Gupta, Tanmay and Schwenk, Dustin and Farhadi, Ali and Hoiem, Derek and Kembhavi, Aniruddha},
title = {Imagine This! Scripts to Compositions to Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Structure Inference Network for Facial Action Unit Recognition
Corneanu, Ciprian and Madadi, Meysam and Escalera, Sergio
[pdf]
[bibtex]
@InProceedings{Corneanu_2018_ECCV,
author = {Corneanu, Ciprian and Madadi, Meysam and Escalera, Sergio},
title = {Deep Structure Inference Network for Facial Action Unit Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Action Anticipation with RBF Kernelized Feature Mapping RNN
Shi, Yuge and Fernando, Basura and Hartley, Richard
[pdf]
[bibtex]
@InProceedings{Shi_2018_ECCV,
author = {Shi, Yuge and Fernando, Basura and Hartley, Richard},
title = {Action Anticipation with RBF Kernelized Feature Mapping RNN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CNN-PS: CNN-based Photometric Stereo for General Non-Convex Surfaces
Ikehata, Satoshi
[pdf]
[bibtex]
@InProceedings{Ikehata_2018_ECCV,
author = {Ikehata, Satoshi},
title = {CNN-PS: CNN-based Photometric Stereo for General Non-Convex Surfaces},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Small-scale Pedestrian Detection Based on Topological Line Localization and Temporal Feature Aggregation
Song, Tao and Sun, Leiyu and Xie, Di and Sun, Haiming and Pu, Shiliang
[pdf]
[bibtex]
@InProceedings{Song_2018_ECCV,
author = {Song, Tao and Sun, Leiyu and Xie, Di and Sun, Haiming and Pu, Shiliang},
title = {Small-scale Pedestrian Detection Based on Topological Line Localization and Temporal Feature Aggregation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Summarizing First-Person Videos from Third Persons' Points of View
HO, HSUAN-I and Chiu, Wei-Chen and Wang, Yu-Chiang Frank
[pdf]
[bibtex]
@InProceedings{HO_2018_ECCV,
author = {HO, HSUAN-I and Chiu, Wei-Chen and Wang, Yu-Chiang Frank},
title = {Summarizing First-Person Videos from Third Persons' Points of View},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Snap Angle Prediction for 360° Panoramas
Xiong, Bo and Grauman, Kristen
[pdf]
[bibtex]
@InProceedings{Xiong_2018_ECCV,
author = {Xiong, Bo and Grauman, Kristen},
title = {Snap Angle Prediction for 360° Panoramas},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition
Yin, Guojun and Sheng, Lu and Liu, Bin and Yu, Nenghai and Wang, Xiaogang and Shao, Jing and Loy, Chen Change
[pdf]
[bibtex]
@InProceedings{Yin_2018_ECCV,
author = {Yin, Guojun and Sheng, Lu and Liu, Bin and Yu, Nenghai and Wang, Xiaogang and Shao, Jing and Loy, Chen Change},
title = {Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
Wang, Nanyang and Zhang, Yinda and Li, Zhuwen and Fu, Yanwei and Liu, Wei and Jiang, Yu-Gang
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Nanyang and Zhang, Yinda and Li, Zhuwen and Fu, Yanwei and Liu, Wei and Jiang, Yu-Gang},
title = {Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Shi, Jianping and Zhang, Chao and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Shi, Jianping and Zhang, Chao and Wang, Xiaogang},
title = {Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Reconstruction-based Pairwise Depth Dataset for Depth Image Enhancement Using CNN
Jeon, Junho and Lee, Seungyong
[pdf]
[bibtex]
@InProceedings{Jeon_2018_ECCV,
author = {Jeon, Junho and Lee, Seungyong},
title = {Reconstruction-based Pairwise Depth Dataset for Depth Image Enhancement Using CNN},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Coded Illumination and Imaging for Fluorescence Based Classification
Asano, Yuta and Meguro, Misaki and Wang, Chao and Lam, Antony and Zheng, Yinqiang and Okabe, Takahiro and Sato, Imari
[pdf]
[bibtex]
@InProceedings{Asano_2018_ECCV,
author = {Asano, Yuta and Meguro, Misaki and Wang, Chao and Lam, Antony and Zheng, Yinqiang and Okabe, Takahiro and Sato, Imari},
title = {Coded Illumination and Imaging for Fluorescence Based Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence
Sun, Shao-Hua and Huh, Minyoung and Liao, Yuan-Hong and Zhang, Ning and Lim, Joseph J.
[pdf]
[bibtex]
@InProceedings{Sun_2018_ECCV,
author = {Sun, Shao-Hua and Huh, Minyoung and Liao, Yuan-Hong and Zhang, Ning and Lim, Joseph J.},
title = {Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Robust Anchor Embedding for Unsupervised Video Person Re-Identification in the Wild
Ye, Mang and Lan, Xiangyuan and Yuen, Pong C.
[pdf]
[bibtex]
@InProceedings{Ye_2018_ECCV,
author = {Ye, Mang and Lan, Xiangyuan and Yuen, Pong C.},
title = {Robust Anchor Embedding for Unsupervised Video Person Re-Identification in the Wild},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Training Binary Weight Networks via Semi-Binary Decomposition
Hu, Qinghao and Li, Gang and Wang, Peisong and Zhang, Yifan and Cheng, Jian
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Qinghao and Li, Gang and Wang, Peisong and Zhang, Yifan and Cheng, Jian},
title = {Training Binary Weight Networks via Semi-Binary Decomposition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Hand Pose Estimation via Latent 2.5D Heatmap Regression
Iqbal, Umar and Molchanov, Pavlo and Breuel, Thomas and Gall, Juergen and Kautz, Jan
[pdf]
[bibtex]
@InProceedings{Iqbal_2018_ECCV,
author = {Iqbal, Umar and Molchanov, Pavlo and Breuel, Thomas and Gall, Juergen and Kautz, Jan},
title = {Hand Pose Estimation via Latent 2.5D Heatmap Regression},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
Zhang, Dongqing and Yang, Jiaolong and Ye, Dongqiangzi and Hua, Gang
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Dongqing and Yang, Jiaolong and Ye, Dongqiangzi and Hua, Gang},
title = {LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Randomized Ensembles for Metric Learning
Xuan, Hong and Souvenir, Richard and Pless, Robert
[pdf]
[bibtex]
@InProceedings{Xuan_2018_ECCV,
author = {Xuan, Hong and Souvenir, Richard and Pless, Robert},
title = {Deep Randomized Ensembles for Metric Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ECO: Efficient Convolutional Network for Online Video Understanding
Zolfaghari, Mohammadreza and Singh, Kamaljeet and Brox, Thomas
[pdf]
[bibtex]
@InProceedings{Zolfaghari_2018_ECCV,
author = {Zolfaghari, Mohammadreza and Singh, Kamaljeet and Brox, Thomas},
title = {ECO: Efficient Convolutional Network for Online Video Understanding},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Proxy Clouds for Live RGB-D Stream Processing and Consolidation
Kaiser, Adrien and Alonso Ybanez Zepeda, Jose and Boubekeur, Tamy
[pdf]
[bibtex]
@InProceedings{Kaiser_2018_ECCV,
author = {Kaiser, Adrien and Alonso Ybanez Zepeda, Jose and Boubekeur, Tamy},
title = {Proxy Clouds for Live RGB-D Stream Processing and Consolidation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Neural Graph Matching Networks for Fewshot 3D Action Recognition
Guo, Michelle and Chou, Edward and Huang, De-An and Song, Shuran and Yeung, Serena and Fei-Fei, Li
[pdf]
[bibtex]
@InProceedings{Guo_2018_ECCV,
author = {Guo, Michelle and Chou, Edward and Huang, De-An and Song, Shuran and Yeung, Serena and Fei-Fei, Li},
title = {Neural Graph Matching Networks for Fewshot 3D Action Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Stereo relative pose from line and point feature triplets
Vakhitov, Alexander and Lempitsky, Victor and Zheng, Yinqiang
[pdf]
[bibtex]
@InProceedings{Vakhitov_2018_ECCV,
author = {Vakhitov, Alexander and Lempitsky, Victor and Zheng, Yinqiang},
title = {Stereo relative pose from line and point feature triplets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A-Contrario Horizon-First Vanishing Point Detection Using Second-Order Grouping Laws
Simon, Gilles and Fond, Antoine and Berger, Marie-Odile
[pdf]
[bibtex]
@InProceedings{Simon_2018_ECCV,
author = {Simon, Gilles and Fond, Antoine and Berger, Marie-Odile},
title = {A-Contrario Horizon-First Vanishing Point Detection Using Second-Order Grouping Laws},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Zoom: a Saliency-Based Sampling Layer for Neural Networks
Recasens, Adria and Kellnhofer, Petr and Stent, Simon and Matusik, Wojciech and Torralba, Antonio
[pdf]
[bibtex]
@InProceedings{Recasens_2018_ECCV,
author = {Recasens, Adria and Kellnhofer, Petr and Stent, Simon and Matusik, Wojciech and Torralba, Antonio},
title = {Learning to Zoom: a Saliency-Based Sampling Layer for Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Chen, Dapeng and Li, Hongsheng and Liu, Xihui and Shen, Yantao and Shao, Jing and Yuan, Zejian and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Dapeng and Li, Hongsheng and Liu, Xihui and Shen, Yantao and Shao, Jing and Yuan, Zejian and Wang, Xiaogang},
title = {Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Less is More: Picking Informative Frames for Video Captioning
Chen, Yangyu and Wang, Shuhui and Zhang, Weigang and Huang, Qingming
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Yangyu and Wang, Shuhui and Zhang, Weigang and Huang, Qingming},
title = {Less is More: Picking Informative Frames for Video Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

BodyNet: Volumetric Inference of 3D Human Body Shapes
Varol, Gul and Ceylan, Duygu and Russell, Bryan and Yang, Jimei and Yumer, Ersin and Laptev, Ivan and Schmid, Cordelia
[pdf]
[bibtex]
@InProceedings{Varol_2018_ECCV,
author = {Varol, Gul and Ceylan, Duygu and Russell, Bryan and Yang, Jimei and Yumer, Ersin and Laptev, Ivan and Schmid, Cordelia},
title = {BodyNet: Volumetric Inference of 3D Human Body Shapes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Towards Human-Level License Plate Recognition
Zhuang, Jiafan and Hou, Saihui and Wang, Zilei and Zha, Zheng-Jun
[pdf]
[bibtex]
@InProceedings{Zhuang_2018_ECCV,
author = {Zhuang, Jiafan and Hou, Saihui and Wang, Zilei and Zha, Zheng-Jun},
title = {Towards Human-Level License Plate Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Dataset for Lane Instance Segmentation in Urban Environments
Roberts, Brook and Kaltwang, Sebastian and Samangooei, Sina and Pender-Bare, Mark and Tertikas, Konstantinos and Redford, John
[pdf]
[bibtex]
@InProceedings{Roberts_2018_ECCV,
author = {Roberts, Brook and Kaltwang, Sebastian and Samangooei, Sina and Pender-Bare, Mark and Tertikas, Konstantinos and Redford, John},
title = {A Dataset for Lane Instance Segmentation in Urban Environments},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DeepIM: Deep Iterative Matching for 6D Pose Estimation
Li, Yi and Wang, Gu and Ji, Xiangyang and Xiang, Yu and Fox, Dieter
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Yi and Wang, Gu and Ji, Xiangyang and Xiang, Yu and Fox, Dieter},
title = {DeepIM: Deep Iterative Matching for 6D Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
Chaudhry, Arslan and Dokania, Puneet K. and Ajanthan, Thalaiyasingam and Torr, Philip H. S.
[pdf]
[bibtex]
@InProceedings{Chaudhry_2018_ECCV,
author = {Chaudhry, Arslan and Dokania, Puneet K. and Ajanthan, Thalaiyasingam and Torr, Philip H. S.},
title = {Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers
Park, Eunbyung and Berg, Alexander C.
[pdf]
[bibtex]
@InProceedings{Park_2018_ECCV,
author = {Park, Eunbyung and Berg, Alexander C.},
title = {Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
Mehta, Sachin and Rastegari, Mohammad and Caspi, Anat and Shapiro, Linda and Hajishirzi, Hannaneh
[pdf]
[bibtex]
@InProceedings{Mehta_2018_ECCV,
author = {Mehta, Sachin and Rastegari, Mohammad and Caspi, Anat and Shapiro, Linda and Hajishirzi, Hannaneh},
title = {ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Wasserstein Divergence for GANs
Wu, Jiqing and Huang, Zhiwu and Thoma, Janine and Acharya, Dinesh and Van Gool, Luc
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Jiqing and Huang, Zhiwu and Thoma, Janine and Acharya, Dinesh and Van Gool, Luc},
title = {Wasserstein Divergence for GANs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane
Cheng, Hao and Lian, Dongze and Gao, Shenghua and Geng, Yanlin
[pdf]
[bibtex]
@InProceedings{Cheng_2018_ECCV,
author = {Cheng, Hao and Lian, Dongze and Gao, Shenghua and Geng, Yanlin},
title = {Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

C-WSL: Count-guided Weakly Supervised Localization
Gao, Mingfei and Li, Ang and Yu, Ruichi and Morariu, Vlad I. and Davis, Larry S.
[pdf]
[bibtex]
@InProceedings{Gao_2018_ECCV,
author = {Gao, Mingfei and Li, Ang and Yu, Ruichi and Morariu, Vlad I. and Davis, Larry S.},
title = {C-WSL: Count-guided Weakly Supervised Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Goal-Oriented Visual Question Generation via Intermediate Rewards
Zhang, Junjie and Wu, Qi and Shen, Chunhua and Zhang, Jian and Lu, Jianfeng and van den Hengel, Anton
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Junjie and Wu, Qi and Shen, Chunhua and Zhang, Jian and Lu, Jianfeng and van den Hengel, Anton},
title = {Goal-Oriented Visual Question Generation via Intermediate Rewards},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ICNet for Real-Time Semantic Segmentation on High-Resolution Images
Zhao, Hengshuang and Qi, Xiaojuan and Shen, Xiaoyong and Shi, Jianping and Jia, Jiaya
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Hengshuang and Qi, Xiaojuan and Shen, Xiaoyong and Shi, Jianping and Jia, Jiaya},
title = {ICNet for Real-Time Semantic Segmentation on High-Resolution Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Fiber Networks for Video Recognition
Chen, Yunpeng and Kalantidis, Yannis and Li, Jianshu and Yan, Shuicheng and Feng, Jiashi
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Yunpeng and Kalantidis, Yannis and Li, Jianshu and Yan, Shuicheng and Feng, Jiashi},
title = {Multi-Fiber Networks for Video Recognition},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection
Wei, Yunchao and Shen, Zhiqiang and Cheng, Bowen and Shi, Honghui and Xiong, Jinjun and Feng, Jiashi and Huang, Thomas
[pdf]
[bibtex]
@InProceedings{Wei_2018_ECCV,
author = {Wei, Yunchao and Shen, Zhiqiang and Cheng, Bowen and Shi, Honghui and Xiong, Jinjun and Feng, Jiashi and Huang, Thomas},
title = {TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence
Jeon, Sangryul and Kim, Seungryong and Min, Dongbo and Sohn, Kwanghoon
[pdf]
[bibtex]
@InProceedings{Jeon_2018_ECCV,
author = {Jeon, Sangryul and Kim, Seungryong and Min, Dongbo and Sohn, Kwanghoon},
title = {PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Super-Identity Convolutional Neural Network for Face Hallucination
Zhang, Kaipeng and Zhang, Zhanpeng and Cheng, Chia-Wen and Hsu, Winston H. and Qiao, Yu and Liu, Wei and Zhang, Tong
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Kaipeng and Zhang, Zhanpeng and Cheng, Chia-Wen and Hsu, Winston H. and Qiao, Yu and Liu, Wei and Zhang, Tong},
title = {Super-Identity Convolutional Neural Network for Face Hallucination},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Look Deeper into Depth: Monocular Depth Estimation with Semantic Booster and Attention-Driven Loss
Jiao, Jianbo and Cao, Ying and Song, Yibing and Lau, Rynson
[pdf]
[bibtex]
@InProceedings{Jiao_2018_ECCV,
author = {Jiao, Jianbo and Cao, Ying and Song, Yibing and Lau, Rynson},
title = {Look Deeper into Depth: Monocular Depth Estimation with Semantic Booster and Attention-Driven Loss},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Xie, Saining and Sun, Chen and Huang, Jonathan and Tu, Zhuowen and Murphy, Kevin
[pdf]
[bibtex]
@InProceedings{Xie_2018_ECCV,
author = {Xie, Saining and Sun, Chen and Huang, Jonathan and Tu, Zhuowen and Murphy, Kevin},
title = {Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Domain Adaptation through Synthesis for Unsupervised Person Re-identification
Bak, Slawomir and Carr, Peter and Lalonde, Jean-Francois
[pdf]
[bibtex]
@InProceedings{Bak_2018_ECCV,
author = {Bak, Slawomir and Carr, Peter and Lalonde, Jean-Francois},
title = {Domain Adaptation through Synthesis for Unsupervised Person Re-identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Predict Crisp Boundaries
Deng, Ruoxi and Shen, Chunhua and Liu, Shengjun and Wang, Huibing and Liu, Xinru
[pdf]
[bibtex]
@InProceedings{Deng_2018_ECCV,
author = {Deng, Ruoxi and Shen, Chunhua and Liu, Shengjun and Wang, Huibing and Liu, Xinru},
title = {Learning to Predict Crisp Boundaries},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting
Liu, Wei and Liao, Shengcai and Hu, Weidong and Liang, Xuezhi and Chen, Xiao
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Wei and Liao, Shengcai and Hu, Weidong and Liang, Xuezhi and Chen, Xiao},
title = {Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Attention-based Ensemble for Deep Metric Learning
Kim, Wonsik and Goyal, Bhavya and Chawla, Kunal and Lee, Jungmin and Kwon, Keunjoo
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Wonsik and Goyal, Bhavya and Chawla, Kunal and Lee, Jungmin and Kwon, Keunjoo},
title = {Attention-based Ensemble for Deep Metric Learning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
Yew, Zi Jian and Lee, Gim Hee
[pdf]
[bibtex]
@InProceedings{Yew_2018_ECCV,
author = {Yew, Zi Jian and Lee, Gim Hee},
title = {3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation
Wu, Zuxuan and Han, Xintong and Lin, Yen-Liang and Uzunbas, Mustafa Gokhan and Goldstein, Tom and Lim, Ser-Nam and Davis, Larry S.
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Zuxuan and Han, Xintong and Lin, Yen-Liang and Uzunbas, Mustafa Gokhan and Goldstein, Tom and Lim, Ser-Nam and Davis, Larry S.},
title = {DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

NNEval: Neural Network based Evaluation Metric for Image Captioning
Sharif, Naeha and White, Lyndon and Bennamoun, Mohammed and Shah, Syed Afaq Ali
[pdf]
[bibtex]
@InProceedings{Sharif_2018_ECCV,
author = {Sharif, Naeha and White, Lyndon and Bennamoun, Mohammed and Shah, Syed Afaq Ali},
title = {NNEval: Neural Network based Evaluation Metric for Image Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning to Segment via Cut-and-Paste
Remez, Tal and Huang, Jonathan and Brown, Matthew
[pdf]
[bibtex]
@InProceedings{Remez_2018_ECCV,
author = {Remez, Tal and Huang, Jonathan and Brown, Matthew},
title = {Learning to Segment via Cut-and-Paste},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Real-Time Hair Rendering using Sequential Adversarial Networks
Wei, Lingyu and Hu, Liwen and Kim, Vladimir and Yumer, Ersin and Li, Hao
[pdf]
[bibtex]
@InProceedings{Wei_2018_ECCV,
author = {Wei, Lingyu and Hu, Liwen and Kim, Vladimir and Yumer, Ersin and Li, Hao},
title = {Real-Time Hair Rendering using Sequential Adversarial Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning Human-Object Interactions by Graph Parsing Neural Networks
Qi, Siyuan and Wang, Wenguan and Jia, Baoxiong and Shen, Jianbing and Zhu, Song-Chun
[pdf]
[bibtex]
@InProceedings{Qi_2018_ECCV,
author = {Qi, Siyuan and Wang, Wenguan and Jia, Baoxiong and Shen, Jianbing and Zhu, Song-Chun},
title = {Learning Human-Object Interactions by Graph Parsing Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
Zhang, Shifeng and Wen, Longyin and Bian, Xiao and Lei, Zhen and Li, Stan Z.
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Shifeng and Wen, Longyin and Bian, Xiao and Lei, Zhen and Li, Stan Z.},
title = {Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Linear RGB-D SLAM for Planar Environments
Kim, Pyojin and Coltin, Brian and Kim, H. Jin
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Pyojin and Coltin, Brian and Kim, H. Jin},
title = {Linear RGB-D SLAM for Planar Environments},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

NAM: Non-Adversarial Unsupervised Domain Mapping
Hoshen, Yedid and Wolf, Lior
[pdf]
[bibtex]
@InProceedings{Hoshen_2018_ECCV,
author = {Hoshen, Yedid and Wolf, Lior},
title = {NAM: Non-Adversarial Unsupervised Domain Mapping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping
Zheng, Haitian and Ji, Mengqi and Wang, Haoqian and Liu, Yebin and Fang, Lu
[pdf]
[bibtex]
@InProceedings{Zheng_2018_ECCV,
author = {Zheng, Haitian and Ji, Mengqi and Wang, Haoqian and Liu, Yebin and Fang, Lu},
title = {CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation
Li, Xiaoxiao and Loy, Chen Change
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Xiaoxiao and Loy, Chen Change},
title = {Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Layer-structured 3D Scene Inference via View Synthesis
Tulsiani, Shubham and Tucker, Richard and Snavely, Noah
[pdf]
[bibtex]
@InProceedings{Tulsiani_2018_ECCV,
author = {Tulsiani, Shubham and Tucker, Richard and Snavely, Noah},
title = {Layer-structured 3D Scene Inference via View Synthesis},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Facial Expression Recognition with Inconsistently Annotated Datasets
Zeng, Jiabei and Shan, Shiguang and Chen, Xilin
[pdf]
[bibtex]
@InProceedings{Zeng_2018_ECCV,
author = {Zeng, Jiabei and Shan, Shiguang and Chen, Xilin},
title = {Facial Expression Recognition with Inconsistently Annotated Datasets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Exploiting Vector Fields for Geometric Rectification of Distorted Document Images
MENG, Gaofeng and SU, Yuanqi and WU, Ying and XIANG, Shiming and PAN, Chunhong
[pdf]
[bibtex]
@InProceedings{MENG_2018_ECCV,
author = {MENG, Gaofeng and SU, Yuanqi and WU, Ying and XIANG, Shiming and PAN, Chunhong},
title = {Exploiting Vector Fields for Geometric Rectification of Distorted Document Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A+D Net: Training a Shadow Detector with Adversarial Shadow Attenuation
Le, Hieu and Yago Vicente, Tomas F. and Nguyen, Vu and Hoai, Minh and Samaras, Dimitris
[pdf]
[bibtex]
@InProceedings{Le_2018_ECCV,
author = {Le, Hieu and Yago Vicente, Tomas F. and Nguyen, Vu and Hoai, Minh and Samaras, Dimitris},
title = {A+D Net: Training a Shadow Detector with Adversarial Shadow Attenuation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Lip Movements Generation at a Glance
Chen, Lele and Li, Zhiheng and Maddox, Ross K. and Duan, Zhiyao and Xu, Chenliang
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Lele and Li, Zhiheng and Maddox, Ross K. and Duan, Zhiyao and Xu, Chenliang},
title = {Lip Movements Generation at a Glance},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Domain transfer through deep activation matching
Huang, Haoshuo and Huang, Qixing and Krahenbuhl, Philipp
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Haoshuo and Huang, Qixing and Krahenbuhl, Philipp},
title = {Domain transfer through deep activation matching},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Geolocation Estimation of Photos using a Hierarchical Model and Scene Classification
Muller-Budack, Eric and Pustu-Iren, Kader and Ewerth, Ralph
[pdf]
[bibtex]
@InProceedings{Muller-Budack_2018_ECCV,
author = {Muller-Budack, Eric and Pustu-Iren, Kader and Ewerth, Ralph},
title = {Geolocation Estimation of Photos using a Hierarchical Model and Scene Classification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Temporal Relational Reasoning in Videos
Zhou, Bolei and Andonian, Alex and Oliva, Aude and Torralba, Antonio
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Bolei and Andonian, Alex and Oliva, Aude and Torralba, Antonio},
title = {Temporal Relational Reasoning in Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Leveraging Motion Priors in Videos for Improving Human Segmentation
Chen, Yu-Ting and Chang, Wen-Yen and Lu, Hai-Lun and Wu, Tingfan and Sun, Min
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Yu-Ting and Chang, Wen-Yen and Lu, Hai-Lun and Wu, Tingfan and Sun, Min},
title = {Leveraging Motion Priors in Videos for Improving Human Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Sequential Clique Optimization for Video Object Segmentation
Koh, Yeong Jun and Lee, Young-Yoon and Kim, Chang-Su
[pdf]
[bibtex]
@InProceedings{Koh_2018_ECCV,
author = {Koh, Yeong Jun and Lee, Young-Yoon and Kim, Chang-Su},
title = {Sequential Clique Optimization for Video Object Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

3D Scene Flow from 4D Light Field Gradients
Ma, Sizhuo and Smith, Brandon M. and Gupta, Mohit
[pdf]
[bibtex]
@InProceedings{Ma_2018_ECCV,
author = {Ma, Sizhuo and Smith, Brandon M. and Gupta, Mohit},
title = {3D Scene Flow from 4D Light Field Gradients},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Accelerating Dynamic Programs via Nested Benders Decomposition with Application to Multi-Person Pose Estimation
Wang, Shaofei and Ihler, Alexander and Kording, Konrad and Yarkony, Julian
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Shaofei and Ihler, Alexander and Kording, Konrad and Yarkony, Julian},
title = {Accelerating Dynamic Programs via Nested Benders Decomposition with Application to Multi-Person Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-scale Residual Network for Image Super-Resolution
Li, Juncheng and Fang, Faming and Mei, Kangfu and Zhang, Guixu
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Juncheng and Fang, Faming and Mei, Kangfu and Zhang, Guixu},
title = {Multi-scale Residual Network for Image Super-Resolution},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Global Point Cloud Registration by Matching Rotation Invariant Features Through Translation Search
Liu, Yinlong and Wang, Chen and Song, Zhijian and Wang, Manning
[pdf]
[bibtex]
@InProceedings{Liu_2018_ECCV,
author = {Liu, Yinlong and Wang, Chen and Song, Zhijian and Wang, Manning},
title = {Efficient Global Point Cloud Registration by Matching Rotation Invariant Features Through Translation Search},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields
Jing, Yongcheng and Liu, Yang and Yang, Yezhou and Feng, Zunlei and Yu, Yizhou and Tao, Dacheng and Song, Mingli
[pdf]
[bibtex]
@InProceedings{Jing_2018_ECCV,
author = {Jing, Yongcheng and Liu, Yang and Yang, Yezhou and Feng, Zunlei and Yu, Yizhou and Tao, Dacheng and Song, Mingli},
title = {Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Modulation Module for Multi-task Learning with Applications in Image Retrieval
Zhao, Xiangyun and Li, Haoxiang and Shen, Xiaohui and Liang, Xiaodan and Wu, Ying
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Xiangyun and Li, Haoxiang and Shen, Xiaohui and Liang, Xiaodan and Wu, Ying},
title = {A Modulation Module for Multi-task Learning with Applications in Image Retrieval},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance
Shu, Zhixin and Sahasrabudhe, Mihir and Guler, Riza Alp and Samaras, Dimitris and Paragios, Nikos and Kokkinos, Iasonas
[pdf]
[bibtex]
@InProceedings{Shu_2018_ECCV,
author = {Shu, Zhixin and Sahasrabudhe, Mihir and Guler, Riza Alp and Samaras, Dimitris and Paragios, Nikos and Kokkinos, Iasonas},
title = {Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
Jayaraman, Dinesh and Gao, Ruohan and Grauman, Kristen
[pdf]
[bibtex]
@InProceedings{Jayaraman_2018_ECCV,
author = {Jayaraman, Dinesh and Gao, Ruohan and Grauman, Kristen},
title = {ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Triplet Loss in Siamese Network for Object Tracking
Dong, Xingping and Shen, Jianbing
[pdf]
[bibtex]
@InProceedings{Dong_2018_ECCV,
author = {Dong, Xingping and Shen, Jianbing},
title = {Triplet Loss in Siamese Network for Object Tracking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Person Re-identification with Deep Similarity-Guided Graph Neural Network
Shen, Yantao and Li, Hongsheng and Yi, Shuai and Chen, Dapeng and Wang, Xiaogang
[pdf]
[bibtex]
@InProceedings{Shen_2018_ECCV,
author = {Shen, Yantao and Li, Hongsheng and Yi, Shuai and Chen, Dapeng and Wang, Xiaogang},
title = {Person Re-identification with Deep Similarity-Guided Graph Neural Network},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

VSO: Visual Semantic Odometry
Lianos, Konstantinos-Nektarios and Schonberger, Johannes L. and Pollefeys, Marc and Sattler, Torsten
[pdf]
[bibtex]
@InProceedings{Lianos_2018_ECCV,
author = {Lianos, Konstantinos-Nektarios and Schonberger, Johannes L. and Pollefeys, Marc and Sattler, Torsten},
title = {VSO: Visual Semantic Odometry},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Volumetric performance capture from minimal camera viewpoints
Gilbert, Andrew and Volino, Marco and Collomosse, John and Hilton, Adrian
[pdf]
[bibtex]
@InProceedings{Gilbert_2018_ECCV,
author = {Gilbert, Andrew and Volino, Marco and Collomosse, John and Hilton, Adrian},
title = {Volumetric performance capture from minimal camera viewpoints},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Videos as Space-Time Region Graphs
Wang, Xiaolong and Gupta, Abhinav
[pdf]
[bibtex]
@InProceedings{Wang_2018_ECCV,
author = {Wang, Xiaolong and Gupta, Abhinav},
title = {Videos as Space-Time Region Graphs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Faces as Lighting Probes via Unsupervised Deep Highlight Extraction
Yi, Renjiao and Zhu, Chenyang and Tan, Ping and Lin, Stephen
[pdf]
[bibtex]
@InProceedings{Yi_2018_ECCV,
author = {Yi, Renjiao and Zhu, Chenyang and Tan, Ping and Lin, Stephen},
title = {Faces as Lighting Probes via Unsupervised Deep Highlight Extraction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised holistic image generation from key local patches
Lee, Donghoon and Yun, Sangdoo and Choi, Sungjoon and Yoo, Hwiyeon and Yang, Ming-Hsuan and Oh, Songhwai
[pdf]
[bibtex]
@InProceedings{Lee_2018_ECCV,
author = {Lee, Donghoon and Yun, Sangdoo and Choi, Sungjoon and Yoo, Hwiyeon and Yang, Ming-Hsuan and Oh, Songhwai},
title = {Unsupervised holistic image generation from key local patches},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Visual Text Correction
Mazaheri, Amir and Shah, Mubarak
[pdf]
[bibtex]
@InProceedings{Mazaheri_2018_ECCV,
author = {Mazaheri, Amir and Shah, Mubarak},
title = {Visual Text Correction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes
Xiao, Taihong and Hong, Jiapeng and Ma, Jinwen
[pdf]
[bibtex]
@InProceedings{Xiao_2018_ECCV,
author = {Xiao, Taihong and Hong, Jiapeng and Ma, Jinwen},
title = {ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation
Bahng, Hyojin and Yoo, Seungjoo and Cho, Wonwoong and Park, David Keetae and Wu, Ziming and Ma, Xiaojuan and Choo, Jaegul
[pdf]
[bibtex]
@InProceedings{Bahng_2018_ECCV,
author = {Bahng, Hyojin and Yoo, Seungjoo and Cho, Wonwoong and Park, David Keetae and Wu, Ziming and Ma, Xiaojuan and Choo, Jaegul},
title = {Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Teaching Machines to Understand Baseball Games: Large-Scale Baseball Video Database for Multiple Video Understanding Tasks
Shim, Minho and Kim, Young Hwi and Kim, Kyungmin and Kim, Seon Joo
[pdf]
[bibtex]
@InProceedings{Shim_2018_ECCV,
author = {Shim, Minho and Kim, Young Hwi and Kim, Kyungmin and Kim, Seon Joo},
title = {Teaching Machines to Understand Baseball Games: Large-Scale Baseball Video Database for Multiple Video Understanding Tasks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Into the Twilight Zone: Depth Estimation using Joint Structure-Stereo Optimization
Sharma, Aashish and Cheong, Loong-Fah
[pdf]
[bibtex]
@InProceedings{Sharma_2018_ECCV,
author = {Sharma, Aashish and Cheong, Loong-Fah},
title = {Into the Twilight Zone: Depth Estimation using Joint Structure-Stereo Optimization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks
Sarkar, Kripasindhu and Hampiholi, Basavaraj and Varanasi, Kiran and Stricker, Didier
[pdf]
[bibtex]
@InProceedings{Sarkar_2018_ECCV,
author = {Sarkar, Kripasindhu and Hampiholi, Basavaraj and Varanasi, Kiran and Stricker, Didier},
title = {Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Coreset-Based Neural Network Compression
Dubey, Abhimanyu and Chatterjee, Moitreya and Ahuja, Narendra
[pdf]
[bibtex]
@InProceedings{Dubey_2018_ECCV,
author = {Dubey, Abhimanyu and Chatterjee, Moitreya and Ahuja, Narendra},
title = {Coreset-Based Neural Network Compression},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Variational Wasserstein Clustering
Mi, Liang and Zhang, Wen and Gu, Xianfeng and Wang, Yalin
[pdf]
[bibtex]
@InProceedings{Mi_2018_ECCV,
author = {Mi, Liang and Zhang, Wen and Gu, Xianfeng and Wang, Yalin},
title = {Variational Wasserstein Clustering},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos
Xu, Mingze and Fan, Chenyou and Wang, Yuchen and Ryoo, Michael S. and Crandall, David J.
[pdf]
[bibtex]
@InProceedings{Xu_2018_ECCV,
author = {Xu, Mingze and Fan, Chenyou and Wang, Yuchen and Ryoo, Michael S. and Crandall, David J.},
title = {Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Zero-shot keyword spotting for visual speech recognition in-the-wild
Stafylakis, Themos and Tzimiropoulos, Georgios
[pdf]
[bibtex]
@InProceedings{Stafylakis_2018_ECCV,
author = {Stafylakis, Themos and Tzimiropoulos, Georgios},
title = {Zero-shot keyword spotting for visual speech recognition in-the-wild},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ContextVP: Fully Context-Aware Video Prediction
Byeon, Wonmin and Wang, Qin and Srivastava, Rupesh Kumar and Koumoutsakos, Petros
[pdf]
[bibtex]
@InProceedings{Byeon_2018_ECCV,
author = {Byeon, Wonmin and Wang, Qin and Srivastava, Rupesh Kumar and Koumoutsakos, Petros},
title = {ContextVP: Fully Context-Aware Video Prediction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Open Set Domain Adaptation by Backpropagation
Saito, Kuniaki and Yamamoto, Shohei and Ushiku, Yoshitaka and Harada, Tatsuya
[pdf]
[bibtex]
@InProceedings{Saito_2018_ECCV,
author = {Saito, Kuniaki and Yamamoto, Shohei and Ushiku, Yoshitaka and Harada, Tatsuya},
title = {Open Set Domain Adaptation by Backpropagation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Learn-to-Score: Efficient 3D Scene Exploration by Predicting View Utility
Hepp, Benjamin and Dey, Debadeepta and Sinha, Sudipta N. and Kapoor, Ashish and Joshi, Neel and Hilliges, Otmar
[pdf]
[bibtex]
@InProceedings{Hepp_2018_ECCV,
author = {Hepp, Benjamin and Dey, Debadeepta and Sinha, Sudipta N. and Kapoor, Ashish and Joshi, Neel and Hilliges, Otmar},
title = {Learn-to-Score: Efficient 3D Scene Exploration by Predicting View Utility},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping
Xue, Chuhui and Lu, Shijian and Zhan, Fangneng
[pdf]
[bibtex]
@InProceedings{Xue_2018_ECCV,
author = {Xue, Chuhui and Lu, Shijian and Zhan, Fangneng},
title = {Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks
Kokkinos, Filippos and Lefkimmiatis, Stamatios
[pdf]
[bibtex]
@InProceedings{Kokkinos_2018_ECCV,
author = {Kokkinos, Filippos and Lefkimmiatis, Stamatios},
title = {Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Good Line Cutting: towards Accurate Pose Tracking of Line-assisted VO/VSLAM
Zhao, Yipu and Vela, Patricio A.
[pdf]
[bibtex]
@InProceedings{Zhao_2018_ECCV,
author = {Zhao, Yipu and Vela, Patricio A.},
title = {Good Line Cutting: towards Accurate Pose Tracking of Line-assisted VO/VSLAM},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Constraint-Aware Deep Neural Network Compression
Chen, Changan and Tung, Frederick and Vedula, Naveen and Mori, Greg
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Changan and Tung, Frederick and Vedula, Naveen and Mori, Greg},
title = {Constraint-Aware Deep Neural Network Compression},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Boosted Attention: Leveraging Human Attention for Image Captioning
Chen, Shi and Zhao, Qi
[pdf]
[bibtex]
@InProceedings{Chen_2018_ECCV,
author = {Chen, Shi and Zhao, Qi},
title = {Boosted Attention: Leveraging Human Attention for Image Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Understanding Perceptual and Conceptual Fluency at a Large Scale
Hu, Shengli and Borji, Ali
[pdf]
[bibtex]
@InProceedings{Hu_2018_ECCV,
author = {Hu, Shengli and Borji, Ali},
title = {Understanding Perceptual and Conceptual Fluency at a Large Scale},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

MaskConnect: Connectivity Learning by Gradient Descent
Ahmed, Karim and Torresani, Lorenzo
[pdf]
[bibtex]
@InProceedings{Ahmed_2018_ECCV,
author = {Ahmed, Karim and Torresani, Lorenzo},
title = {MaskConnect: Connectivity Learning by Gradient Descent},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Exploring Visual Relationship for Image Captioning
Yao, Ting and Pan, Yingwei and Li, Yehao and Mei, Tao
[pdf]
[bibtex]
@InProceedings{Yao_2018_ECCV,
author = {Yao, Ting and Pan, Yingwei and Li, Yehao and Mei, Tao},
title = {Exploring Visual Relationship for Image Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Diagnosing Error in Temporal Action Detectors
Alwassel, Humam and Caba Heilbron, Fabian and Escorcia, Victor and Ghanem, Bernard
[pdf]
[bibtex]
@InProceedings{Alwassel_2018_ECCV,
author = {Alwassel, Humam and Caba Heilbron, Fabian and Escorcia, Victor and Ghanem, Bernard},
title = {Diagnosing Error in Temporal Action Detectors},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Efficient Semantic Scene Completion Network with Spatial Group Convolution
Zhang, Jiahui and Zhao, Hao and Yao, Anbang and Chen, Yurong and Zhang, Li and Liao, Hongen
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Jiahui and Zhao, Hao and Yao, Anbang and Chen, Yurong and Zhang, Li and Liao, Hongen},
title = {Efficient Semantic Scene Completion Network with Spatial Group Convolution},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Task-driven Webpage Saliency
Zheng, Quanlong and Jiao, Jianbo and Cao, Ying and Lau, Rynson W.H.
[pdf]
[bibtex]
@InProceedings{Zheng_2018_ECCV,
author = {Zheng, Quanlong and Jiao, Jianbo and Cao, Ying and Lau, Rynson W.H.},
title = {Task-driven Webpage Saliency},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multi-Scale Context Intertwining for Semantic Segmentation
Lin, Di and Ji, Yuanfeng and Lischinski, Dani and Cohen-Or, Daniel and Huang, Hui
[pdf]
[bibtex]
@InProceedings{Lin_2018_ECCV,
author = {Lin, Di and Ji, Yuanfeng and Lischinski, Dani and Cohen-Or, Daniel and Huang, Hui},
title = {Multi-Scale Context Intertwining for Semantic Segmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Multiple-gaze geometry: Inferring novel 3D locations from gazes observed in monocular video
Brau, Ernesto and Guan, Jinyan and Jeffries, Tanya and Barnard, Kobus
[pdf]
[bibtex]
@InProceedings{Brau_2018_ECCV,
author = {Brau, Ernesto and Guan, Jinyan and Jeffries, Tanya and Barnard, Kobus},
title = {Multiple-gaze geometry: Inferring novel 3D locations from gazes observed in monocular video},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

HybridFusion: Real-Time Performance Capture Using a Single Depth Sensor and Sparse IMUs
Zheng, Zerong and Yu, Tao and Li, Hao and Guo, Kaiwen and Dai, Qionghai and Fang, Lu and Liu, Yebin
[pdf]
[bibtex]
@InProceedings{Zheng_2018_ECCV,
author = {Zheng, Zerong and Yu, Tao and Li, Hao and Guo, Kaiwen and Dai, Qionghai and Fang, Lu and Liu, Yebin},
title = {HybridFusion: Real-Time Performance Capture Using a Single Depth Sensor and Sparse IMUs},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Macro-Micro Adversarial Network for Human Parsing
Luo, Yawei and Zheng, Zhedong and Zheng, Liang and Guan, Tao and Yu, Junqing and Yang, Yi
[pdf]
[bibtex]
@InProceedings{Luo_2018_ECCV,
author = {Luo, Yawei and Zheng, Zhedong and Zheng, Liang and Guan, Tao and Yu, Junqing and Yang, Yi},
title = {Macro-Micro Adversarial Network for Human Parsing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pivot Correlational Neural Network for Multimodal Video Categorization
Kang, Sunghun and Kim, Junyeong and Choi, Hyunsoo and Kim, Sungjin and Yoo, Chang D.
[pdf]
[bibtex]
@InProceedings{Kang_2018_ECCV,
author = {Kang, Sunghun and Kim, Junyeong and Choi, Hyunsoo and Kim, Sungjin and Yoo, Chang D.},
title = {Pivot Correlational Neural Network for Multimodal Video Categorization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semantically Aware Urban 3D Reconstruction with Plane-Based Regularization
Holzmann, Thomas and Maurer, Michael and Fraundorfer, Friedrich and Bischof, Horst
[pdf]
[bibtex]
@InProceedings{Holzmann_2018_ECCV,
author = {Holzmann, Thomas and Maurer, Michael and Fraundorfer, Friedrich and Bischof, Horst},
title = {Semantically Aware Urban 3D Reconstruction with Plane-Based Regularization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

AugGAN: Cross Domain Adaptation with GAN-based Data Augmentation
Huang, Sheng-Wei and Lin, Che-Tsung and Chen, Shu-Ping and Wu, Yen-Yi and Hsu, Po-Hao and Lai, Shang-Hong
[pdf]
[bibtex]
@InProceedings{Huang_2018_ECCV,
author = {Huang, Sheng-Wei and Lin, Che-Tsung and Chen, Shu-Ping and Wu, Yen-Yi and Hsu, Po-Hao and Lai, Shang-Hong},
title = {AugGAN: Cross Domain Adaptation with GAN-based Data Augmentation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training
Zou, Yang and Yu, Zhiding and Vijaya Kumar, B.V.K. and Wang, Jinsong
[pdf]
[bibtex]
@InProceedings{Zou_2018_ECCV,
author = {Zou, Yang and Yu, Zhiding and Vijaya Kumar, B.V.K. and Wang, Jinsong},
title = {Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Fictitious GAN: Training GANs with Historical Models
Ge, Hao and Xia, Yin and Chen, Xu and Berry, Randall and Wu, Ying
[pdf]
[bibtex]
@InProceedings{Ge_2018_ECCV,
author = {Ge, Hao and Xia, Yin and Chen, Xu and Berry, Randall and Wu, Ying},
title = {Fictitious GAN: Training GANs with Historical Models},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Perturbation Robust Representations of Topological Persistence Diagrams
Som, Anirudh and Thopalli, Kowshik and Natesan Ramamurthy, Karthikeyan and Venkataraman, Vinay and Shukla, Ankita and Turaga, Pavan
[pdf]
[bibtex]
@InProceedings{Som_2018_ECCV,
author = {Som, Anirudh and Thopalli, Kowshik and Natesan Ramamurthy, Karthikeyan and Venkataraman, Vinay and Shukla, Ankita and Turaga, Pavan},
title = {Perturbation Robust Representations of Topological Persistence Diagrams},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures
Dong, Jin-Dong and Cheng, An-Chieh and Juan, Da-Cheng and Wei, Wei and Sun, Min
[pdf]
[bibtex]
@InProceedings{Dong_2018_ECCV,
author = {Dong, Jin-Dong and Cheng, An-Chieh and Juan, Da-Cheng and Wei, Wei and Sun, Min},
title = {DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

SketchyScene: Richly-Annotated Scene Sketches
Zou, Changqing and Yu, Qian and Du, Ruofei and Mo, Haoran and Song, Yi-Zhe and Xiang, Tao and Gao, Chengying and Chen, Baoquan and Zhang, Hao
[pdf]
[bibtex]
@InProceedings{Zou_2018_ECCV,
author = {Zou, Changqing and Yu, Qian and Du, Ruofei and Mo, Haoran and Song, Yi-Zhe and Xiang, Tao and Gao, Chengying and Chen, Baoquan and Zhang, Hao},
title = {SketchyScene: Richly-Annotated Scene Sketches},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contour Knowledge Transfer for Salient Object Detection
Li, Xin and Yang, Fan and Cheng, Hong and Liu, Wei and Shen, Dinggang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Xin and Yang, Fan and Cheng, Hong and Liu, Wei and Shen, Dinggang},
title = {Contour Knowledge Transfer for Salient Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset
Ray, Jamie and Wang, Heng and Tran, Du and Wang, Yufei and Feiszli, Matt and Torresani, Lorenzo and Paluri, Manohar
[pdf]
[bibtex]
@InProceedings{Ray_2018_ECCV,
author = {Ray, Jamie and Wang, Heng and Tran, Du and Wang, Yufei and Feiszli, Matt and Torresani, Lorenzo and Paluri, Manohar},
title = {Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Saliency Detection in 360° Videos
Zhang, Ziheng and Xu, Yanyu and Yu, Jingyi and Gao, Shenghua
[pdf]
[bibtex]
@InProceedings{Zhang_2018_ECCV,
author = {Zhang, Ziheng and Xu, Yanyu and Yu, Jingyi and Gao, Shenghua},
title = {Saliency Detection in 360° Videos},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

DetNet: Design Backbone for Object Detection
Li, Zeming and Peng, Chao and Yu, Gang and Zhang, Xiangyu and Deng, Yangdong and Sun, Jian
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Zeming and Peng, Chao and Yu, Gang and Zhang, Xiangyu and Deng, Yangdong and Sun, Jian},
title = {DetNet: Design Backbone for Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Facial Dynamics Interpreter Network: What are the Important Relations between Local Dynamics for Facial Trait Estimation?
Kim, Seong Tae and Ro, Yong Man
[pdf]
[bibtex]
@InProceedings{Kim_2018_ECCV,
author = {Kim, Seong Tae and Ro, Yong Man},
title = {Facial Dynamics Interpreter Network: What are the Important Relations between Local Dynamics for Facial Trait Estimation?},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Video Object Segmentation by Learning Location-Sensitive Embeddings
Ci, Hai and Wang, Chunyu and Wang, Yizhou
[pdf]
[bibtex]
@InProceedings{Ci_2018_ECCV,
author = {Ci, Hai and Wang, Chunyu and Wang, Yizhou},
title = {Video Object Segmentation by Learning Location-Sensitive Embeddings},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Transferable Adversarial Perturbations
Zhou, Wen and Hou, Xin and Chen, Yongjun and Tang, Mengyun and Huang, Xiangqi and Gan, Xiang and Yang, Yong
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Wen and Hou, Xin and Chen, Yongjun and Tang, Mengyun and Huang, Xiangqi and Gan, Xiang and Yang, Yong},
title = {Transferable Adversarial Perturbations},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

A Segmentation-aware Deep Fusion Network for Compressed Sensing MRI
Fan, Zhiwen and Sun, Liyan and Ding, Xinghao and Huang, Yue and Cai, Congbo and Paisley, John
[pdf]
[bibtex]
@InProceedings{Fan_2018_ECCV,
author = {Fan, Zhiwen and Sun, Liyan and Ding, Xinghao and Huang, Yue and Cai, Congbo and Paisley, John},
title = {A Segmentation-aware Deep Fusion Network for Compressed Sensing MRI},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

GANimation: Anatomically-aware Facial Animation from a Single Image
Pumarola, Albert and Agudo, Antonio and Martinez, Aleix M. and Sanfeliu, Alberto and Moreno-Noguer, Francesc
[pdf]
[bibtex]
@InProceedings{Pumarola_2018_ECCV,
author = {Pumarola, Albert and Agudo, Antonio and Martinez, Aleix M. and Sanfeliu, Alberto and Moreno-Noguer, Francesc},
title = {GANimation: Anatomically-aware Facial Animation from a Single Image},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Graph R-CNN for Scene Graph Generation
Yang, Jianwei and Lu, Jiasen and Lee, Stefan and Batra, Dhruv and Parikh, Devi
[pdf]
[bibtex]
@InProceedings{Yang_2018_ECCV,
author = {Yang, Jianwei and Lu, Jiasen and Lee, Stefan and Batra, Dhruv and Parikh, Devi},
title = {Graph R-CNN for Scene Graph Generation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Interpretable Basis Decomposition for Visual Explanation
Zhou, Bolei and Sun, Yiyou and Bau, David and Torralba, Antonio
[pdf]
[bibtex]
@InProceedings{Zhou_2018_ECCV,
author = {Zhou, Bolei and Sun, Yiyou and Bau, David and Torralba, Antonio},
title = {Interpretable Basis Decomposition for Visual Explanation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-Identification
Karianakis, Nikolaos and Liu, Zicheng and Chen, Yinpeng and Soatto, Stefano
[pdf]
[bibtex]
@InProceedings{Karianakis_2018_ECCV,
author = {Karianakis, Nikolaos and Liu, Zicheng and Chen, Yinpeng and Soatto, Stefano},
title = {Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-Identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ArticulatedFusion: Real-time Reconstruction of Motion, Geometry and Segmentation Using a Single Depth Camera
Li, Chao and Zhao, Zheheng and Guo, Xiaohu
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Chao and Zhao, Zheheng and Guo, Xiaohu},
title = {ArticulatedFusion: Real-time Reconstruction of Motion, Geometry and Segmentation Using a Single Depth Camera},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Metric Learning with Hierarchical Triplet Loss
Ge, Weifeng
[pdf]
[bibtex]
@InProceedings{Ge_2018_ECCV,
author = {Ge, Weifeng},
title = {Deep Metric Learning with Hierarchical Triplet Loss},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Prokudin, Sergey and Gehler, Peter and Nowozin, Sebastian
[pdf]
[bibtex]
@InProceedings{Prokudin_2018_ECCV,
author = {Prokudin, Sergey and Gehler, Peter and Nowozin, Sebastian},
title = {Deep Directional Statistics: Pose Estimation with Uncertainty Quantification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Semantic Match Consistency for Long-Term Visual Localization
Toft, Carl and Stenborg, Erik and Hammarstrand, Lars and Brynte, Lucas and Pollefeys, Marc and Sattler, Torsten and Kahl, Fredrik
[pdf]
[bibtex]
@InProceedings{Toft_2018_ECCV,
author = {Toft, Carl and Stenborg, Erik and Hammarstrand, Lars and Brynte, Lucas and Pollefeys, Marc and Sattler, Torsten and Kahl, Fredrik},
title = {Semantic Match Consistency for Long-Term Visual Localization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Decouple Learning for Parameterized Image Operators
Fan, Qingnan and Chen, Dongdong and Yuan, Lu and Hua, Gang and Yu, Nenghai and Chen, Baoquan
[pdf]
[bibtex]
@InProceedings{Fan_2018_ECCV,
author = {Fan, Qingnan and Chen, Dongdong and Yuan, Lu and Hua, Gang and Yu, Nenghai and Chen, Baoquan},
title = {Decouple Learning for Parameterized Image Operators},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Structural Consistency and Controllability for Diverse Colorization
Messaoud, Safa and Forsyth, David and Schwing, Alexander G.
[pdf]
[bibtex]
@InProceedings{Messaoud_2018_ECCV,
author = {Messaoud, Safa and Forsyth, David and Schwing, Alexander G.},
title = {Structural Consistency and Controllability for Diverse Colorization},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Deep Component Analysis via Alternating Direction Neural Networks
Murdock, Calvin and Chang, MingFang and Lucey, Simon
[pdf]
[bibtex]
@InProceedings{Murdock_2018_ECCV,
author = {Murdock, Calvin and Chang, MingFang and Lucey, Simon},
title = {Deep Component Analysis via Alternating Direction Neural Networks},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Maximum Margin Metric Learning Over Discriminative Nullspace for Person Re-identification
Ali, T M Feroz and Chaudhuri, Subhasis
[pdf]
[bibtex]
@InProceedings{Ali_2018_ECCV,
author = {Ali, T M Feroz and Chaudhuri, Subhasis},
title = {Maximum Margin Metric Learning Over Discriminative Nullspace for Person Re-identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Pose-Normalized Image Generation for Person Re-identification
Qian, Xuelin and Fu, Yanwei and Xiang, Tao and Wang, Wenxuan and Qiu, Jie and Wu, Yang and Jiang, Yu-Gang and Xue, Xiangyang
[pdf]
[bibtex]
@InProceedings{Qian_2018_ECCV,
author = {Qian, Xuelin and Fu, Yanwei and Xiang, Tao and Wang, Wenxuan and Qiu, Jie and Wu, Yang and Jiang, Yu-Gang and Xue, Xiangyang},
title = {Pose-Normalized Image Generation for Person Re-identification},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Cross-Modal Hamming Hashing
Cao, Yue and Liu, Bin and Long, Mingsheng and Wang, Jianmin
[pdf]
[bibtex]
@InProceedings{Cao_2018_ECCV,
author = {Cao, Yue and Liu, Bin and Long, Mingsheng and Wang, Jianmin},
title = {Cross-Modal Hamming Hashing},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Modeling Visual Context is Key to Augmenting Object Detection Datasets
Dvornik, Nikita and Mairal, Julien and Schmid, Cordelia
[pdf]
[bibtex]
@InProceedings{Dvornik_2018_ECCV,
author = {Dvornik, Nikita and Mairal, Julien and Schmid, Cordelia},
title = {Modeling Visual Context is Key to Augmenting Object Detection Datasets},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

ReenactGAN: Learning to Reenact Faces via Boundary Transfer
Wu, Wayne and Zhang, Yunxuan and Li, Cheng and Qian, Chen and Loy, Chen Change
[pdf]
[bibtex]
@InProceedings{Wu_2018_ECCV,
author = {Wu, Wayne and Zhang, Yunxuan and Li, Cheng and Qian, Chen and Loy, Chen Change},
title = {ReenactGAN: Learning to Reenact Faces via Boundary Transfer},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Universal Sketch Perceptual Grouping
Li, Ke and Pang, Kaiyue and Song, Jifei and Song, Yi-Zhe and Xiang, Tao and Hospedales, Timothy M. and Zhang, Honggang
[pdf]
[bibtex]
@InProceedings{Li_2018_ECCV,
author = {Li, Ke and Pang, Kaiyue and Song, Jifei and Song, Yi-Zhe and Xiang, Tao and Hospedales, Timothy M. and Zhang, Honggang},
title = {Universal Sketch Perceptual Grouping},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Compositional Learning for Human Object Interaction
Kato, Keizo and Li, Yin and Gupta, Abhinav
[pdf]
[bibtex]
@InProceedings{Kato_2018_ECCV,
author = {Kato, Keizo and Li, Yin and Gupta, Abhinav},
title = {Compositional Learning for Human Object Interaction},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}