Self-supervised Learning and Domain Adaptation for Visual Analysis

Author: Kevin Lin
Publisher:
ISBN:
Category :
Languages : en
Pages : 131

Book Description
Supervised training with deep Convolutional Neural Networks (CNNs) has achieved great success in various visual recognition tasks. However, it requires a large amount of well-annotated data, and data labeling, especially for large-scale image datasets, is very expensive. How to learn an effective model without labeled training data has become an important problem for many applications. A promising solution is to create a learning protocol for neural networks so that they can learn to teach themselves without manual labels. This technique is referred to as self-supervised learning, and it has recently drawn increasing attention as a way to improve learning performance. In this thesis, we first present our work on learning binary descriptors for fast image retrieval without manual labeling. We observe that images of the same category should have similar visual textures, and that these similar textures are usually invariant to shift, scale, and rotation. Thus, we can generate similar texture patch pairs automatically for training CNNs by shifting, scaling, and rotating image patches. Based on this observation, we design a training protocol for deep CNNs that automatically generates pair-wise pseudo labels describing the similarity between two given images. The proposed method performs favorably against the baselines on tasks including patch matching, image retrieval, and object recognition. In the second part of this thesis, we turn our focus to human-centric analysis applications and present our work on learning multi-person part segmentation without human labeling. Our proposed complementary learning technique learns a neural network model for multi-person part segmentation using a synthetic dataset and a real dataset. We observe that real and synthetic humans share a common skeleton structure. During learning, the proposed model extracts human skeletons, which effectively bridge the synthetic and real domains. Without using human-annotated part segmentation labels, the resulting model works well on real-world images. Our method outperforms the state-of-the-art approaches on multiple public datasets. Then, we discuss our work on accelerating multi-person pose estimation using a proposed concatenated pyramid network. We observe that each image may contain an unknown number of people who can appear at any scale or position. This makes fast multi-person pose estimation very challenging. Unlike earlier deep learning approaches that extract image features by using a series of convolutions, our proposed method extracts image features from each convolution layer in parallel, which better captures image features at different scales and improves the performance of human pose estimation. Our proposed method eliminates the need for multi-scale inference and multi-stage detection and is many times faster than the state-of-the-art approaches, while achieving better accuracy on public datasets. Next, we present our work on 3D human mesh construction from a single image. We propose a novel approach to learn the human mesh representation without any ground truth mesh. This is made possible by introducing two new terms into the loss function of a graph convolutional neural network (Graph CNN). The first term is a Laplacian prior that acts as a regularizer on the mesh construction. The second term is a part segmentation loss that forces the projected region of the constructed mesh to match the part segmentation. Experimental results on multiple public datasets show that, without using 3D ground truth meshes, the proposed approach outperforms the previous state-of-the-art approaches that require 3D ground truth meshes for training. Finally, we summarize our completed work and discuss future research directions.
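
The patch-pair idea in the description above can be illustrated with a minimal sketch. It is not the thesis's implementation: the helper names (`crop`, `rotate90`, `downscale2`, `make_pairs`), the fixed 90-degree rotation, the factor-of-two scaling, and the use of a patch from a second image as the dissimilar example are all assumptions made for illustration. Images are plain 2D lists so the example needs only the standard library.

```python
import random


def crop(image, top, left, size):
    """Return a size x size patch (list of rows) from a 2D list image."""
    return [row[left:left + size] for row in image[top:top + size]]


def rotate90(patch):
    """Rotate a square patch by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def downscale2(patch):
    """Halve the resolution by keeping every second pixel."""
    return [row[::2] for row in patch[::2]]


def make_pairs(image_a, image_b, size=4, max_shift=1, seed=0):
    """Build (patch, patch, pseudo_label) triples; 1 = similar, 0 = dissimilar.

    Both images must be at least max_shift + 2 * size pixels on each side.
    """
    rng = random.Random(seed)
    top = left = max_shift  # keep a margin so shifted crops stay inside
    anchor = crop(image_a, top, left, size)

    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    shifted = crop(image_a, top + dy, left + dx, size)
    rotated = rotate90(anchor)
    # Crop a region twice as large, then shrink it back to size x size.
    scaled = downscale2(crop(image_a, top, left, 2 * size))
    # Assumption: a patch from a different image is the dissimilar example.
    other = crop(image_b, top, left, size)

    return [
        (anchor, shifted, 1),
        (anchor, rotated, 1),
        (anchor, scaled, 1),
        (anchor, other, 0),
    ]


# Toy 10x10 "images" with distinct pixel values.
image_a = [[r * 10 + c for c in range(10)] for r in range(10)]
image_b = [[-(r * 10 + c) for c in range(10)] for r in range(10)]
pairs = make_pairs(image_a, image_b)
print([label for _, _, label in pairs])  # prints [1, 1, 1, 0]
```

The pseudo labels come entirely from how each pair was constructed, so no manual annotation is involved. A real pipeline would presumably sample many anchors and transformation parameters per image rather than one fixed set.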

Visual Domain Adaptation in the Deep Learning Era

Author: Gabriela Csurka
Publisher: Springer Nature
ISBN: 3031791754
Category : Computers
Languages : en
Pages : 182

Book Description
Solving problems with deep neural networks typically relies on massive amounts of labeled training data to achieve high performance. While in many situations huge volumes of unlabeled data can be and often are generated and available, the cost of acquiring data labels remains high. Transfer learning (TL), and in particular domain adaptation (DA), has emerged as an effective solution to overcome the burden of annotation, exploiting the unlabeled data available from the target domain together with labeled data or pre-trained models from similar, yet different, source domains. The aim of this book is to provide an overview of such DA/TL methods applied to computer vision, a field whose popularity has increased significantly in the last few years. We set the stage by revisiting the theoretical background and some of the historical shallow methods before discussing and comparing different domain adaptation strategies that exploit deep architectures for visual recognition. We introduce the space of self-training-based methods that draw inspiration from the related fields of deep semi-supervised and self-supervised learning to address deep domain adaptation. Going beyond the classic domain adaptation problem, we then explore the rich space of problem settings that arise when applying domain adaptation in practice, such as partial or open-set DA, where source and target data categories do not fully overlap, continuous DA, where the target data comes as a stream, and so on. We next consider the least restrictive setting of domain generalization (DG), an extreme case where neither labeled nor unlabeled target data are available during training. Finally, we close by considering the emerging area of learning-to-learn and how it can be applied to further improve existing approaches to cross-domain learning problems such as DA and DG.

Visual Domain Adaptation in the Deep Learning Era

Author: Gabriela Csurka
Publisher: Springer
ISBN: 9783031791802
Category : Computers
Languages : en
Pages : 168

Domain Adaptation for Visual Understanding

Author: Richa Singh
Publisher: Springer Nature
ISBN: 3030306712
Category : Computers
Languages : en
Pages : 144

Book Description
This unique volume reviews the latest advances in domain adaptation in the training of machine learning algorithms for visual understanding, offering valuable insights from an international selection of experts in the field. The text presents a diverse selection of novel techniques, covering applications of object recognition, face recognition, and action and event recognition. Topics and features: reviews the domain adaptation-based machine learning algorithms available for visual understanding, and provides a deep metric learning approach; introduces a novel unsupervised method for image-to-image translation, and a video segment retrieval model that utilizes ensemble learning; proposes a unique way to determine which dataset is most useful in the base training, in order to improve the transferability of deep neural networks; describes a quantitative method for estimating the discrepancy between the source and target data to enhance image classification performance; presents a technique for multi-modal fusion that enhances facial action recognition, and a framework for intuition learning in domain adaptation; examines an original interpolation-based approach to address the issue of tracking model degradation in correlation filter-based methods. This authoritative work will serve as an invaluable reference for researchers and practitioners interested in machine learning-based visual recognition and understanding.

Unsupervised Domain Adaptation

Author: Jingjing Li
Publisher: Springer Nature
ISBN: 9819710251
Category :
Languages : en
Pages : 234

Book Description


Domain Adaptation in Computer Vision with Deep Learning

Author: Hemanth Venkateswara
Publisher: Springer Nature
ISBN: 3030455297
Category : Computers
Languages : en
Pages : 256

Book Description
This book provides a survey of deep learning approaches to domain adaptation in computer vision. It gives the reader an overview of the state-of-the-art research in deep learning based domain adaptation. This book also discusses the various approaches to deep learning based domain adaptation in recent years. It outlines the importance of domain adaptation for the advancement of computer vision, consolidates the research in the area and provides the reader with promising directions for future research in domain adaptation. Divided into four parts, the first part of this book begins with an introduction to domain adaptation, which outlines the problem statement, the role of domain adaptation and the motivation for research in this area. It includes a chapter outlining pre-deep learning era domain adaptation techniques. The second part of this book highlights feature alignment based approaches to domain adaptation. The third part of this book outlines image alignment procedures for domain adaptation. The final section of this book presents novel directions for research in domain adaptation. This book targets researchers working in artificial intelligence, machine learning, deep learning and computer vision. Industry professionals and entrepreneurs seeking to adopt deep learning into their applications will also be interested in this book.

Learning and Leveraging Shared Domain Semantics to Counteract Visual Domain Shifts

Author: Róger Bermúdez Chacón
Publisher:
ISBN:
Category :
Languages : en
Pages : 90

Book Description
Author's keywords: Domain Adaptation; Transfer Learning; Multiple-Instance Learning; Self-Supervised Learning; Neural Architecture Search; Biomedical Imaging.

Person Re-Identification

Author: Shaogang Gong
Publisher: Springer Science & Business Media
ISBN: 144716296X
Category : Computers
Languages : en
Pages : 446

Book Description
The first book of its kind dedicated to the challenge of person re-identification, this text provides an in-depth, multidisciplinary discussion of recent developments and state-of-the-art methods. Features: introduces examples of robust feature representations, reviews salient feature weighting and selection mechanisms and examines the benefits of semantic attributes; describes how to segregate meaningful body parts from background clutter; examines the use of 3D depth images and contextual constraints derived from the visual appearance of a group; reviews approaches to feature transfer function and distance metric learning and discusses potential solutions to issues of data scalability and identity inference; investigates the limitations of existing benchmark datasets, presents strategies for camera topology inference and describes techniques for improving post-rank search efficiency; explores the design rationale and implementation considerations of building a practical re-identification system.

Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning

Author: Shadi Albarqouni
Publisher: Springer Nature
ISBN: 3030605485
Category : Computers
Languages : en
Pages : 224

Book Description
This book constitutes the refereed proceedings of the Second MICCAI Workshop on Domain Adaptation and Representation Transfer, DART 2020, and the First MICCAI Workshop on Distributed and Collaborative Learning, DCL 2020, held in conjunction with MICCAI 2020 in October 2020. The conference was planned to take place in Lima, Peru, but changed to an online format due to the Coronavirus pandemic. For DART 2020, 12 full papers were accepted from 18 submissions. They deal with methodological advancements and ideas that can improve the applicability of machine learning (ML)/deep learning (DL) approaches to clinical settings by making them robust and consistent across different domains. For DCL 2020, the 8 papers included in this book were accepted from a total of 12 submissions. They focus on the comparison, evaluation and discussion of methodological advancement and practical ideas about machine learning applied to problems where data cannot be stored in centralized databases; where information privacy is a priority; where it is necessary to deliver strong guarantees on the amount and nature of private information that may be revealed by the model as a result of training; and where it's necessary to orchestrate, manage and direct clusters of nodes participating in the same learning task.

Image Analysis and Processing – ICIAP 2022

Author: Stan Sclaroff
Publisher: Springer Nature
ISBN: 3031064275
Category : Computers
Languages : en
Pages : 816

Book Description
The proceedings set LNCS 13231, 13232, and 13233 constitutes the refereed proceedings of the 21st International Conference on Image Analysis and Processing, ICIAP 2022, which was held May 23-27, 2022, in Lecce, Italy. The 168 papers included in the proceedings were carefully reviewed and selected from 307 submissions. They deal with video analysis and understanding; pattern recognition and machine learning; deep learning; multi-view geometry and 3D computer vision; image analysis, detection and recognition; multimedia; biomedical and assistive technology; digital forensics and biometrics; image processing for cultural heritage; robot vision; etc.