Multimodal Learning toward Micro-Video Understanding

Author: Liqiang Nie
Publisher: Springer Nature
ISBN: 3031022556
Category: Technology & Engineering
Language: en
Pages: 170

Book Description
Micro-videos, a new form of user-generated content, have been spreading widely across social platforms such as Vine, Kuaishou, and TikTok. Unlike traditional long videos, micro-videos are usually recorded by smart mobile devices at any place within a few seconds. Owing to their brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for high-order micro-video understanding. Micro-video understanding is, however, non-trivial due to the following challenges: (1) how to represent micro-videos that convey only one or a few high-level themes or concepts; (2) how to utilize the hierarchical structure of venue categories to guide micro-video analysis; (3) how to alleviate the influence of low quality caused by complex surrounding environments and camera shake; (4) how to model the multimodal sequential data, i.e., textual, acoustic, visual, and social modalities, to enhance micro-video understanding; and (5) how to construct large-scale benchmark datasets for the analysis. These challenges have been largely unexplored to date. In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. In particular, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and outline future research directions in multimodal learning toward micro-video understanding.

Image Fusion in Remote Sensing

Author: Arian Azarang
Publisher: Springer Nature
ISBN: 3031022564
Category: Technology & Engineering
Language: en
Pages: 89

Book Description
Image fusion in remote sensing, or pansharpening, involves fusing spatial (panchromatic) and spectral (multispectral) images that are captured by different sensors on satellites. This book addresses image fusion approaches for remote sensing applications, covering both conventional and deep learning approaches. First, the conventional approaches to image fusion in remote sensing are discussed. These approaches include component substitution, multi-resolution, and model-based algorithms. Then, the recently developed deep learning approaches involving single-objective and multi-objective loss functions are discussed. Experimental results are provided comparing conventional and deep learning approaches in terms of both low-resolution and full-resolution objective metrics that are commonly used in remote sensing. The book concludes with anticipated future trends in pansharpening, or image fusion in remote sensing.

ECAI 2023

Author: K. Gal
Publisher: IOS Press
ISBN: 164368437X
Category: Computers
Language: en
Pages: 3328

Book Description
Artificial intelligence, or AI, now affects the day-to-day life of almost everyone on the planet, and continues to be a perennial hot topic in the news. This book presents the proceedings of ECAI 2023, the 26th European Conference on Artificial Intelligence, and of PAIS 2023, the 12th Conference on Prestigious Applications of Intelligent Systems, held from 30 September to 4 October 2023 and on 3 October 2023 respectively in Kraków, Poland. Since 1974, ECAI has been the premier venue for presenting AI research in Europe, and this annual conference has become the place for researchers and practitioners of AI to discuss the latest trends and challenges in all subfields of AI, and to demonstrate innovative applications and uses of advanced AI technology. ECAI 2023 received 1896 submissions – a record number – of which 1691 were retained for review, ultimately resulting in an acceptance rate of 23%. The 390 papers included here cover topics including machine learning, natural language processing, multi-agent systems, vision, and knowledge representation and reasoning. PAIS 2023 received 17 submissions, of which 10 were accepted after a rigorous review process. Those 10 papers cover topics ranging from fostering better working environments, behavior modeling and citizen science to large language models and neuro-symbolic applications, and are also included here. Presenting a comprehensive overview of current research and developments in AI, the book will be of interest to all those working in the field.

Graph Learning for Fashion Compatibility Modeling

Author: Weili Guan
Publisher: Springer Nature
ISBN: 3031188179
Category: Computers
Language: en
Pages: 120

Book Description
This book sheds light on state-of-the-art theories for more challenging outfit compatibility modeling scenarios. In particular, this book presents several cutting-edge graph learning techniques that can be used for outfit compatibility modeling. Due to its remarkable economic value, fashion compatibility modeling has gained increasing research attention in recent years. Although great efforts have been dedicated to this research area, previous studies mainly focused on fashion compatibility modeling for outfits that involve only two items and overlooked the fact that each outfit may be composed of a variable number of items. This book develops a series of graph-learning-based outfit compatibility modeling schemes, all of which have been proven effective on several public real-world datasets. This systematic approach benefits readers by introducing the techniques for compatibility modeling of outfits that involve a variable number of composing items. To deal with the challenging task of outfit compatibility modeling, this book provides comprehensive solutions, including correlation-oriented graph learning, modality-oriented graph learning, unsupervised disentangled graph learning, partially supervised disentangled graph learning, and metapath-guided heterogeneous graph learning. Moreover, this book sheds light on research frontiers that can inspire future research directions for scientists and researchers.

Pattern Recognition and Computer Vision

Author: Shiqi Yu
Publisher: Springer Nature
ISBN: 3031189078
Category: Computers
Language: en
Pages: 842

Book Description
The 4-volume set LNCS 13534, 13535, 13536 and 13537 constitutes the refereed proceedings of the 5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022, held in Shenzhen, China, in November 2022. The 233 full papers presented were carefully reviewed and selected from 564 submissions. The papers have been organized in the following topical sections: Theories and Feature Extraction; Machine learning, Multimedia and Multimodal; Optimization and Neural Network and Deep Learning; Biomedical Image Processing and Analysis; Pattern Classification and Clustering; 3D Computer Vision and Reconstruction, Robots and Autonomous Driving; Recognition, Remote Sensing; Vision Analysis and Understanding; Image Processing and Low-level Vision; Object Detection, Segmentation and Tracking.

Web Information Systems Engineering – WISE 2024

Author: Mahmoud Barhamgi
Publisher: Springer Nature
ISBN: 9819605709
Category:
Language: en
Pages: 476


Compatibility Modeling

Author: Xuemeng Song
Publisher: Springer Nature
ISBN: 3031023218
Category: Computers
Language: en
Pages: 118

Book Description
Nowadays, fashion has become an essential aspect of people's daily life. As each outfit usually comprises several complementary items, such as a top, bottom, shoes, and accessories, a proper outfit largely relies on the harmonious matching of these items. Nevertheless, not everyone is good at outfit composition, especially those who have a poor fashion aesthetic. Fortunately, in recent years the number of online fashion-oriented communities, like IQON and Chictopia, as well as e-commerce sites, like Amazon and eBay, has grown. The tremendous amount of real-world data regarding people's various fashion behaviors has opened a door to automatic clothing matching. Despite its significant value, compatibility modeling for clothing matching, which assesses the compatibility score for a given set of two or more fashion items, e.g., a blouse and a skirt, poses tough challenges: (a) the absence of a comprehensive benchmark; (b) comprehensive compatibility modeling with multi-modal feature variables remains largely untapped; (c) how to utilize domain knowledge to guide the machine learning; (d) how to enhance the interpretability of compatibility modeling; and (e) how to model the user factor in personalized compatibility modeling. These challenges have been largely unexplored to date. In this book, we shed light on several state-of-the-art theories on compatibility modeling. In particular, to facilitate the research, we first build three large-scale benchmark datasets from different online fashion websites, including IQON and Amazon. We then introduce a general data-driven compatibility modeling scheme based on advanced neural networks. To make use of the abundant fashion domain knowledge, i.e., clothing matching rules, we next present a novel knowledge-guided compatibility modeling framework. Thereafter, to enhance the model interpretability, we put forward a prototype-wise interpretable compatibility modeling approach. Following that, noticing the subjective aesthetics of users, we extend the general compatibility modeling to a personalized version. Moreover, we further study the real-world problem of personalized capsule wardrobe creation, aiming to generate a minimal collection of garments that is both compatible and suitable for the user. Finally, we conclude the book and present future research directions, such as generative compatibility modeling, virtual try-on with arbitrary poses, and clothing generation.

Multidimensional Signal Processing: Methods and Applications

Author: Roumen Kountchev
Publisher: Springer Nature
ISBN: 9819751810
Category:
Language: en
Pages: 400


Doing a Master's Dissertation in TESOL and Applied Linguistics

Author: Lindy Woodrow
Publisher: Routledge
ISBN: 0429995776
Category: Language Arts & Disciplines
Language: en
Pages: 225

Book Description
Doing a Master’s Dissertation in TESOL and Applied Linguistics is a practical guide for master’s students tackling research and research writing for the first time. Structured for use in class or as part of an independent study, and divided into the four stages of designing, researching, writing up and submitting a dissertation, this book: carefully guides readers from the very beginning of producing a research proposal, all the way through to assessment procedures and the provisions for resubmission; covers publishing your dissertation and applying for higher research degrees, including funding; addresses all the most fundamental concerns students have about master’s dissertations, including how to choose a topic and conduct a literature review; draws upon examples from master’s dissertations from the UK, US and Australia and provides numerous ‘how-to’ tables and checklists; and includes activities and resources to facilitate master’s research and dissertation writing, as well as FAQs and solutions at the end of each chapter. Tailor-made for MA students in TESOL or Applied Linguistics, this book is essential reading for students on these degrees around the world as well as for their supervisors and programme directors.

Automated Machine Learning and Meta-Learning for Multimedia

Author: Wenwu Zhu
Publisher: Springer Nature
ISBN: 3030881326
Category: Computers
Language: en
Pages: 240

Book Description
This book disseminates and promotes recent research progress and frontier developments in AutoML and meta-learning, as well as their applications in computer vision, natural language processing, multimedia, and data mining related fields. These are exciting and fast-growing research directions in the general field of machine learning. The authors advocate novel, high-quality research findings and innovative solutions to the challenging problems in AutoML and meta-learning. This topic is at the core of artificial intelligence and is attractive to audiences from both academia and industry. This book is highly accessible to the whole machine learning community, including researchers, students, and practitioners who are interested in AutoML, meta-learning, and their applications in multimedia, computer vision, natural language processing, and data mining related tasks. The book is self-contained and designed for introductory and intermediate audiences. No special prerequisite knowledge is required to read this book.