Image and Video Text Recognition Using Convolutional Neural Networks

Author: Zohra Saidane
Publisher: LAP Lambert Academic Publishing
ISBN: 9783844324617
Category : Graph theory
Languages : en
Pages : 156

Book Description
Thanks to increasingly powerful storage media, multimedia resources have become essential, and the challenge is how to find relevant information quickly. The text within images and videos can be a useful key for this task. In this work we focus on recognizing the content of the text, and we assume that the text box has already been detected and located correctly. We focus on a particular machine learning algorithm, convolutional neural networks (CNNs). These are networks of neurons whose topology is similar to that of the mammalian visual cortex. CNNs were initially used for recognition of handwritten digits and were then applied successfully to many pattern recognition problems. We propose in this work a new method of binarization of text images, a new method for segmentation of text images, the study of a convolutional neural network for character recognition in images, a discussion of the relevance of the binarization step in machine-learning-based recognition of text in images, and a new method of text recognition in images based on graph theory.

Grokking Machine Learning

Author: Luis Serrano
Publisher: Simon and Schuster
ISBN: 1617295914
Category : Computers
Languages : en
Pages : 510

Book Description
Grokking Machine Learning presents machine learning algorithms and techniques in a way that anyone can understand. This book skips the confused academic jargon and offers clear explanations that require only basic algebra. As you go, you'll build interesting projects with Python, including models for spam detection and image recognition. You'll also pick up practical skills for cleaning and preparing data.

Scene Text Understanding in Natural Images With Convolutional Neural Networks

Author: Dafang He
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Text in images contains rich semantic information. The ability to read text could be used in many different applications, ranging from autonomous driving and image or video indexing to assistive technology for visually impaired people. This problem is typically called scene text understanding. Understanding text in natural images usually involves several sub-fields: (1) scene text detection, (2) scene text recognition, and (3) scene text verification or retrieval. In this dissertation, I investigate scene text understanding with a focus on text detection and text verification. Scene text detection aims at finding the location of each text instance. Usually we expect the model to predict a bounding box for each text instance. It shares several difficulties with regular object detection, such as noisy images and scale variance. However, one of the major differences between regular object detection and scene text detection is that we usually need to predict an oriented or even curved bounding box for each text instance. Scene text recognition usually follows scene text detection in an end-to-end text reading system. The model needs to transcribe each text instance. Scene text verification verifies the existence of text in natural images. It is the most critical part in building a scene text retrieval system. In this dissertation, I explore various methods for scene text detection and verification with convolutional neural networks (CNNs). Specifically, for scene text detection, I propose three algorithms and one training framework. The first algorithm adopts a traditional region proposal method with a novel CNN classifier which aggregates local context into classification. The second detection algorithm uses a fully convolutional neural network for semantic text segmentation. A novel instance-aware segmentation is proposed to further split the extracted text block into text instances. The third work focuses on arbitrarily oriented scene text detection. It proposes a general and novel framework called Detect-Associate-Segment (DAS) for detecting arbitrarily oriented text. A keypoint-based model is designed based on the framework, which achieves state-of-the-art performance on various benchmark datasets. In addition to detection algorithms, this dissertation also explores a new training framework for scene text detection. A novel contour task is introduced to assist scene text detection and improve the final performance. For scene text verification, this dissertation studies a new end-to-end model design which outperforms traditional algorithms by a large margin. It is demonstrated on a large-scale scene text dataset with millions of street view images.

Deep Learning for Computer Vision

Author: Rajalingappaa Shanmugamani
Publisher: Packt Publishing Ltd
ISBN: 1788293355
Category : Computers
Languages : en
Pages : 304

Book Description
Learn how to model and train advanced neural networks to implement a variety of Computer Vision tasks.

Key Features: Train different kinds of deep learning models from scratch to solve specific problems in Computer Vision. Combine the power of Python, Keras, and TensorFlow to build deep learning models for object detection, image classification, similarity learning, image captioning, and more. Includes tips on optimizing and improving the performance of your models under various constraints.

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and finds enormous applications in the areas of robotics, automation, and so on. This book will also show you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning. In this book, you will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

What you will learn: Set up an environment for deep learning with Python, TensorFlow, and Keras. Define and train a model for image and video classification. Use features from a pre-trained Convolutional Neural Network model for image retrieval. Understand and implement object detection using the real-world Pedestrian Detection scenario. Learn about various problems in image captioning and how to overcome them by training images and text together. Implement similarity matching and train a model for face recognition. Understand the concept of generative models and use them for image generation. Deploy your deep learning models and optimize them for high performance.

Who this book is for: This book is targeted at data scientists and Computer Vision practitioners who wish to apply the concepts of Deep Learning to overcome any problem related to Computer Vision. A basic knowledge of programming in Python, and some understanding of machine learning concepts, is required to get the best out of this book.

Label Text Recognition Using Image Processing Techniques and Convolutional Neural Networks for Smart Library

Author:
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Abstract: The library is a common and important public facility in contemporary society for information dissemination and cultural communication. A library usually holds thousands of books, and each book is assigned a unique position so that visitors can find it easily by checking the library's database. However, misplaced books are hard for readers to find. Finding these misplaced books and putting them back in place is therefore one of the important jobs of librarians. In this thesis, a book label recognition algorithm based on convolutional neural networks is proposed to help librarians find misplaced books by scanning the book labels. The algorithm is divided into two parts. The first part applies image processing techniques to extract the characters of the labels attached to each book from images of the bookshelves. The second part uses convolutional neural networks (CNNs) to train a classifier for recognizing characters. In this part, a CNN architecture with four convolutional layers is designed to train classifiers for the letters and digits used in the label text. Finally, the algorithm combines the results of the two parts to recognize the text of a label by classifying each of its letters or digits.

Cognitively Inspired Video Text Processing

Author: Palaiahnakote Shivakumara
Publisher: Springer Nature
ISBN: 9811670692
Category : Computers
Languages : en
Pages : 283

Book Description
As technologies advance rapidly, text detection and recognition is receiving special attention from researchers. Several real-time applications of video text processing require cognitive-based methods to find a solution. The main applications are (1) retrieving and indexing video based on the semantics of its content, (2) machine translation to assist foreigners, (3) assisting blind people to walk on the road freely without aid, (4) automatic vehicle driving, (5) license plate tracing to catch vehicles that violate traffic signals, (6) monitoring images posted on social media based on their text and content, (7) identifying a location based on the addresses of streets, shops, etc., (8) tracing players in sports based on the jersey/bib number or text, and (9) in the same way, tracing bib numbers in marathons and other events. For all of these applications, text detection and recognition in video and natural scene images is an integral part of the system.

AI 2003: Advances in Artificial Intelligence

Author: Tamas D. Gedeon
Publisher: Springer Science & Business Media
ISBN: 3540206469
Category : Computers
Languages : en
Pages : 1095

Book Description
This book constitutes the refereed proceedings of the 16th Australian Conference on Artificial Intelligence, AI 2003, held in Perth, Australia in December 2003. The 87 revised full papers presented together with 4 keynote papers were carefully reviewed and selected from 179 submissions. The papers are organized in topical sections on ontologies, problem solving, knowledge discovery and data mining, expert systems, neural network applications, belief revision and theorem proving, reasoning and logic, machine learning, AI applications, neural computing, intelligent agents, computer vision, medical applications, machine learning and language, AI and business, soft computing, language understanding, and theory.

A Guide to Convolutional Neural Networks for Computer Vision

Author: Salman Khan
Publisher:
ISBN: 9781681732787
Category :
Languages : en
Pages : 207

Book Description
Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these state-of-the-art visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision. This self-contained guide will benefit those who seek to both understand the theory behind CNNs and to gain hands-on experience on the application of CNNs in computer vision. It provides a comprehensive introduction to CNNs starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies that are related to the application of CNN in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation. This book is ideal for undergraduate and graduate students, as no prior background knowledge in the field is required to follow the material, as well as new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.

Image and Graphics

Author: Yu-Jin Zhang
Publisher: Springer
ISBN: 3319219693
Category : Computers
Languages : en
Pages : 615

Book Description
This book constitutes the refereed conference proceedings of the 8th International Conference on Image and Graphics, ICIG 2015 held in Tianjin, China, in August 2015. The 164 revised full papers and 6 special issue papers were carefully reviewed and selected from 339 submissions. The papers focus on various advances of theory, techniques and algorithms in the fields of images and graphics.

Pattern Recognition and Image Analysis

Author: Aythami Morales
Publisher: Springer Nature
ISBN: 3030313328
Category : Computers
Languages : en
Pages : 657

Book Description
This 2-volume set constitutes the refereed proceedings of the 9th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2019, held in Madrid, Spain, in July 2019. The 99 papers in these volumes were carefully reviewed and selected from 137 submissions. They are organized in topical sections named: Part I: best ranked papers; machine learning; pattern recognition; image processing and representation. Part II: biometrics; handwriting and document analysis; other applications.