Sequential Decision Making in Dynamic Systems

Author: Yixuan Zhai
Publisher:
ISBN: 9781369343229
Category :
Languages : en
Pages :

Book Description
We study sequential decision-making problems in the presence of uncertainty in dynamic pricing, intrusion detection, and routing in communication networks. In such problems, a decision maker is usually able to learn from feedback (observations). We consider the design of optimal strategies and analyze their performance. In the first part, we consider a dynamic pricing problem under unknown demand models. We start with a monopoly dynamic pricing problem, in which a seller offers prices to a stream of customers and observes either success or failure in each sale attempt. The underlying demand model is unknown to the seller and can take one of M possible forms. We show that this problem can be formulated as a multi-armed bandit with dependent arms. We propose a dynamic pricing policy based on the likelihood ratio test and show that it achieves complete learning, i.e., it offers bounded regret, where regret is defined as the revenue loss with respect to the case of a known demand model. This is in sharp contrast with the logarithmically growing regret of multi-armed bandits with independent arms. We then consider an oligopoly dynamic pricing problem with finitely many candidate demand models. Beyond learning efficiency, we assume that sellers are individually rational and restrict attention to strategies within a certain class of equilibria. We formulate the oligopoly problem as a repeated Bertrand game with incomplete information. Two scenarios are investigated: sellers with equal marginal costs and sellers with asymmetric marginal costs. For equal marginal costs, we develop a dynamic pricing strategy called Competitive and Cooperative Demand Learning (CCDL), under which all sellers collude and obtain the same average total profit as a monopoly. The strategy is shown to be a subgame perfect Nash equilibrium and Pareto efficient.
We further show that the proposed competitive pricing strategy achieves bounded regret, where regret is defined as the total expected loss in profit with respect to the ideal scenario of a known demand model. For asymmetric marginal costs, we develop a dynamic pricing strategy called Demand Learning under Collusion (DLC). If sellers are patient enough, a tacit collusion among a subset of sellers may form, depending on the marginal costs and the underlying demand model. Under the limit-of-means criterion, DLC is shown to be a subgame-perfect and Pareto-efficient equilibrium, and it offers bounded regret over an infinite horizon. Under the discounting criterion, DLC is shown to be a subgame-perfect ε-equilibrium that is ε-efficient and incurs arbitrarily small regret. The dual problem, an infinitely repeated Cournot competition, is also formulated, and the economic efficiency of the Bertrand and Cournot formulations, measured by social welfare, is compared. In the second part, we consider an intrusion detection problem and formulate it as a dynamic search for a target located in one of K cells, with a fixed number of searches. At each time, one cell is searched, and the search result is subject to false alarms. The objective is a policy governing the sequential selection of cells that minimizes the probability of error in locating the target within a fixed time horizon. We show that the optimal search policy is myopic in nature and has a simple structure. In the third part, we consider the shortest-path routing problem in a communication network with random link costs drawn from unknown distributions. A realization of the total end-to-end cost is observed when a path is selected for communication. The objective is an online learning algorithm that minimizes the total expected communication cost in the long run.
The problem is formulated as a multi-armed bandit problem with dependent arms, and an algorithm based on basis-based learning integrated with a Best Linear Unbiased Estimator (BLUE) is developed.
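As a rough illustration of the monopoly setting described above, the sketch below prices greedily under a maximum-likelihood demand model and updates the likelihood of every candidate model after each sale attempt. The two candidate models, the price grid, and the purchase probabilities are invented for illustration; this is not the dissertation's actual likelihood-ratio-test policy, only a simplified sketch of the dependent-arms idea.

```python
import math
import random

# Hypothetical setup: M = 2 candidate demand models, each mapping an
# offered price to a purchase probability (all numbers are illustrative).
PRICES = [1.0, 2.0, 3.0]
MODELS = [
    {1.0: 0.9, 2.0: 0.5, 3.0: 0.2},   # candidate model 0 (the true one here)
    {1.0: 0.8, 2.0: 0.3, 3.0: 0.1},   # candidate model 1
]
TRUE_MODEL = 0

def expected_revenue(model, price):
    return price * model[price]

def run(horizon=2000, seed=1):
    rng = random.Random(seed)
    loglik = [0.0] * len(MODELS)      # log-likelihood of each candidate
    revenue = 0.0
    for _ in range(horizon):
        # Price greedily under the maximum-likelihood demand model.
        k = max(range(len(MODELS)), key=lambda i: loglik[i])
        price = max(PRICES, key=lambda p: expected_revenue(MODELS[k], p))
        sale = rng.random() < MODELS[TRUE_MODEL][price]
        revenue += price if sale else 0.0
        # Every observation updates *all* candidates (the arms are
        # dependent), which is what makes fast learning possible.
        for i, m in enumerate(MODELS):
            q = m[price]
            loglik[i] += math.log(q if sale else 1.0 - q)
    best = max(range(len(MODELS)), key=lambda i: loglik[i])
    return best, revenue

best, rev = run()
print(best)  # with enough samples, the likelihood concentrates on the true model
```

Because every price offered is informative about all candidate models at once, the identity of the true model is learned at an exponential rate, which is the intuition behind the bounded-regret result.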

Anticipatory Optimization for Dynamic Decision Making

Author: Stephan Meisel
Publisher: Springer Science & Business Media
ISBN: 146140505X
Category : Business & Economics
Languages : en
Pages : 192

Book Description
The availability of today’s online information systems rapidly increases the relevance of dynamic decision making within a large number of operational contexts. Whenever a sequence of interdependent decisions occurs, making a single decision raises the need to anticipate its future impact on the entire decision process. Anticipatory support is needed for a broad variety of dynamic and stochastic decision problems from different operational contexts such as finance, energy management, manufacturing, and transportation. Example problems include asset allocation, feed-in of electricity produced by wind power, and scheduling and routing. All these problems entail a sequence of decisions contributing to an overall goal and taking place over a certain period of time. Each decision is derived by solving an optimization problem. As a consequence, a stochastic and dynamic decision problem resolves into a series of optimization problems to be formulated and solved by anticipation of the remaining decision process. However, actually solving a dynamic decision problem by means of approximate dynamic programming is still a major scientific challenge. Most of the work done so far is devoted to problems that allow the underlying optimization problems to be formulated as linear programs. Problem domains like scheduling and routing, where linear programming typically does not produce a significant benefit, have not been considered so far. Therefore, the industry demand for dynamic scheduling and routing is still predominantly satisfied by purely heuristic approaches to anticipatory decision making. Although this may work well for certain dynamic decision problems, these approaches lack transferability of findings to other, related problems. This book serves two major purposes:
‐ It provides a comprehensive and unique view of anticipatory optimization for dynamic decision making. It fully integrates Markov decision processes, dynamic programming, data mining, and optimization, and introduces a new perspective on approximate dynamic programming. Moreover, the book identifies different degrees of anticipation, enabling an assessment of specific approaches to dynamic decision making.
‐ It shows for the first time how to successfully solve a dynamic vehicle routing problem by approximate dynamic programming, elaborating on every building block required for this kind of approach to dynamic vehicle routing. Thereby the book has a pioneering character and is intended to provide a footing for the dynamic vehicle routing community.

Control and Dynamic Systems V28

Author: C.T. Leondes
Publisher: Elsevier
ISBN: 0323162681
Category : Technology & Engineering
Languages : en
Pages : 363

Book Description
Control and Dynamic Systems: Advances in Theory and Applications, Volume 28: Advances in Algorithms and Computational Techniques in Dynamic Systems Control, Part 1 of 3 discusses developments in algorithms and computational techniques for control and dynamic systems. This book presents algorithms and numerical techniques used for the analysis and control design of stochastic linear systems with multiplicative and additive noise. It also discusses computational techniques for the matrix pseudoinverse in minimum variance reduced-order filtering and control; decomposition techniques in multiobjective discrete-time dynamic problems; computational techniques in robotic systems; reduced-complexity algorithms using microprocessors; algorithms for image-based tracking; and modeling of linear and nonlinear systems. This volume will be an important reference source for practitioners in the field who are looking for techniques with significant applied implications.

The Logic of Adaptive Behavior

Author: Martijn van Otterlo
Publisher: IOS Press
ISBN: 1586039695
Category : Business & Economics
Languages : en
Pages : 508

Book Description
Markov decision processes have become the de facto standard for modeling and solving sequential decision-making problems under uncertainty. This book studies lifting Markov decision processes, reinforcement learning, and dynamic programming to the first-order (or relational) setting.
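As a point of reference for the flat (propositional) setting that the book lifts to first-order representations, here is a minimal value-iteration sketch on a toy two-state MDP; the states, actions, rewards, and discount factor are invented for illustration.

```python
# Minimal value iteration on a toy two-state MDP (numbers are illustrative).
# Bellman optimality update: V(s) = max_a sum_s' P(s'|s,a) [R(s,a,s') + g V(s')]
P = {  # P[state][action] -> list of (next_state, probability, reward)
    0: {"stay": [(0, 1.0, 0.0)], "go": [(1, 0.9, 1.0), (0, 0.1, 0.0)]},
    1: {"stay": [(1, 1.0, 2.0)], "go": [(0, 1.0, 0.0)]},
}
gamma = 0.9

V = {s: 0.0 for s in P}
for _ in range(200):  # enough sweeps for near-exact convergence here
    V = {s: max(sum(p * (r + gamma * V[s2]) for s2, p, r in outs)
                for outs in P[s].values())
         for s in P}

# Greedy policy with respect to the converged value function.
policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                         for s2, p, r in P[s][a]))
          for s in P}
print(policy)  # in state 1, repeatedly collecting reward 2 is optimal
```

Relational MDP formalisms generalize exactly this computation: states and actions become logical atoms, and the value function is represented over abstract state descriptions rather than an enumerated table like `V` above.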

Dynamical Systems

Author: Zeraoulia Elhadj
Publisher: CRC Press
ISBN: 0429647425
Category : Mathematics
Languages : en
Pages : 189

Book Description
Chaos is the idea that a system will produce very different long-term behaviors when its initial conditions are perturbed only slightly. Chaos is used for novel, time- or energy-critical interdisciplinary applications. Examples include high-performance circuits and devices, liquid mixing, chemical reactions, biological systems, crisis management, secure information processing, and critical decision-making in politics and economics, as well as military applications. This book presents the latest investigations in the theory of chaotic systems and their dynamics. It covers theoretical aspects of the subject arising in the study of both discrete- and continuous-time chaotic dynamical systems, and presents the state of the art in the more advanced studies of chaotic dynamical systems.

Control and Dynamic Systems Volume 36

Author: Richard A. Leondes
Publisher: Newnes
ISBN: 0323139515
Category : Science
Languages : en
Pages : 423

Book Description
Control and Dynamic Systems: Advances in Theory and Applications, Volume 36 reviews advances in theory and applications of large scale control and dynamic systems. Contributors focus on production control and the determination of optimal production rates, along with active control systems, uncertainty in control system design, and methods for analyzing multistage commodity markets. This volume is organized into eight chapters and begins with an introduction to multiobjective decision-tree analysis and its significance in applied situations, with two substantive examples. It then shifts to important techniques for the determination of robust economic policies, methods used in the analysis of multistage commodity markets, and a computationally effective algorithm for the determination of the optimal production rate. This book also describes many highly effective techniques for near optimal and robust model truncation. Robust adaptive identification and control algorithms for disturbances and unmodeled system dynamics are given consideration. The final chapter provides examples of the applied significance of the techniques presented in this book, including such large scale systems areas as aerospace, defense, chemical, environmental, and infrastructural industries. This book will be of interest to students and researchers in engineering and computer science.

Handbook of Reinforcement Learning and Control

Author: Kyriakos G. Vamvoudakis
Publisher: Springer Nature
ISBN: 3030609901
Category : Technology & Engineering
Languages : en
Pages : 833

Book Description
This handbook presents state-of-the-art research in reinforcement learning, focusing on its applications in the control and game theory of dynamic systems and future directions for related research and technology. The contributions gathered in this book deal with challenges faced when using learning and adaptation methods to solve academic and industrial problems, such as optimization in dynamic environments with single and multiple agents, convergence and performance analysis, and online implementation. They explore means by which these difficulties can be solved, and cover a wide range of related topics including: deep learning; artificial intelligence; applications of game theory; mixed modality learning; and multi-agent reinforcement learning. Practicing engineers and scholars in the field of machine learning, game theory, and autonomous control will find the Handbook of Reinforcement Learning and Control to be thought-provoking, instructive and informative.

Change Detection and Input Design in Dynamical Systems

Author: Feza Kerestecioğlu
Publisher: Wiley-Blackwell
ISBN:
Category : Technology & Engineering
Languages : en
Pages : 176

Book Description
Concerned with the detection and diagnosis of abrupt changes in dynamic systems operating in a noisy environment and the design of auxiliary inputs to enhance such procedures. The inputs are chosen to minimize the average detection time while ensuring a specified false alarm rate. Offline and online inputs are considered, and the results are applicable to many possible changes. Annotation copyright by Book News, Inc., Portland, OR
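As a concrete taste of this problem area, the sketch below implements a CUSUM-style detector for an abrupt mean shift in Gaussian noise. CUSUM is a standard tool for trading off detection delay against false-alarm rate; the means, variance, and threshold here are illustrative, and the book's own procedures (including the design of auxiliary inputs) go well beyond this baseline.

```python
import random

def cusum(samples, mu0=0.0, mu1=1.0, sigma=1.0, threshold=8.0):
    """Return the sample index at which a shift in mean from mu0 to mu1
    is declared, or None if no alarm is raised."""
    g = 0.0
    for t, x in enumerate(samples):
        # Log-likelihood ratio of N(mu1, sigma^2) vs. N(mu0, sigma^2).
        llr = (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma**2
        # Accumulate evidence for a change, clipped at zero.
        g = max(0.0, g + llr)
        if g > threshold:  # larger threshold: fewer false alarms, slower detection
            return t
    return None

# Illustrative signal: 200 samples of pure noise, then the mean shifts to 1.
rng = random.Random(0)
pre = [rng.gauss(0.0, 1.0) for _ in range(200)]   # no change
post = [rng.gauss(1.0, 1.0) for _ in range(100)]  # change at t = 200
alarm = cusum(pre + post)
print(alarm)  # alarm fires shortly after the change point
```

The threshold plays exactly the role described in the blurb: it fixes the false-alarm rate, and the design question is then how quickly a change can be detected on average.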

Patterns, Predictions, and Actions: Foundations of Machine Learning

Author: Moritz Hardt
Publisher: Princeton University Press
ISBN: 0691233721
Category : Computers
Languages : en
Pages : 321

Book Description
An authoritative, up-to-date graduate textbook on machine learning that highlights its historical context and societal impacts. Patterns, Predictions, and Actions introduces graduate students to the essentials of machine learning while offering invaluable perspective on its history and social implications. Beginning with the foundations of decision making, Moritz Hardt and Benjamin Recht explain how representation, optimization, and generalization are the constituents of supervised learning. They go on to provide self-contained discussions of causality, the practice of causal inference, sequential decision making, and reinforcement learning, equipping readers with the concepts and tools they need to assess the consequences that may arise from acting on statistical decisions.
‐ Provides a modern introduction to machine learning, showing how data patterns support predictions and consequential actions
‐ Pays special attention to societal impacts and fairness in decision making
‐ Traces the development of machine learning from its origins to today
‐ Features a novel chapter on machine learning benchmarks and datasets
‐ Invites readers from all backgrounds, requiring some experience with probability, calculus, and linear algebra
‐ An essential textbook for students and a guide for researchers

Optimization in Planning and Operation of Electric Power Systems

Author: Karl Frauendorfer
Publisher: Springer Science & Business Media
ISBN: 366212646X
Category : Business & Economics
Languages : en
Pages : 362

Book Description
Continually increasing requirements in power supply necessitate efficient control of electric power systems, and optimization has emerged as a subject of particular importance. Papers on modelling aspects of unit commitment and optimal power flow provide the introduction to power system control and its associated problem statements. Owing to the nature of the underlying optimization problems, recent developments in advanced and well-established mathematical programming methodologies are presented, illustrating how dynamic, separable, continuous, and stochastic features might be exploited. Complementing the various methodologies, a number of presentations report experiences with optimization packages currently used for unit commitment and optimal power flow calculations. This work represents the state of the art in mathematical programming methodologies, unit commitment, optimal power flow, and their applications in power system control.