Dynamic Pricing and Inventory Control with Learning

Author: Nicholas C. Petruzzi
Publisher:
ISBN:
Category : Inventory control
Languages : en
Pages : 50

Book Description


Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information

Author: Boxiao Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 42

Book Description
We consider the periodic-review dynamic pricing and inventory control problem with fixed ordering cost. Demand is random and price dependent, and unsatisfied demand is backlogged. With complete demand information, the celebrated (s,S,p) policy is proved to be optimal, where s and S are the reorder point and order-up-to level of the ordering strategy, and p, a function of the on-hand inventory level, characterizes the pricing strategy. In this paper, we consider incomplete demand information and develop online learning algorithms whose average profit approaches that of the optimal (s,S,p) policy with a tight Õ(√T) regret rate. A number of salient features differentiate our work from existing online learning research in the OM literature. First, computing the optimal (s,S,p) policy requires solving a dynamic program (DP) over multiple periods involving unknown quantities, whereas the majority of learning problems in operations management only require solving single-period optimization problems. It is hence challenging to establish stability results through DP recursions, which we accomplish by proving uniform convergence of the profit-to-go function. The necessity of analyzing action-dependent state transitions over multiple periods resembles a reinforcement learning problem, which is considerably more difficult than the problems handled by existing bandit learning algorithms. Second, the pricing function p is infinite-dimensional, and approaching it is much more challenging than approaching a finite number of parameters, as in existing research. The demand-price relationship is estimated based on an upper confidence bound, but the confidence interval cannot be explicitly calculated due to the complexity of the DP recursion. Finally, due to the multi-period nature of (s,S,p) policies, the actual distribution of the randomness in demand plays an important role in determining the optimal pricing strategy p, and this distribution is unknown to the learner a priori.
In this paper, the demand randomness is approximated by an empirical distribution constructed using dependent samples, and a novel Wasserstein-metric-based argument is employed to prove convergence of the empirical distribution.
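
To make the (s,S,p) structure described above concrete, here is a minimal simulation sketch. It is not the authors' learning algorithm: the pricing rule price_for, the linear demand curve, and every cost parameter are hypothetical values chosen only for illustration.

```python
import random


def price_for(inventory, base_price=10.0, slope=0.05, floor=5.0):
    """Hypothetical pricing rule p(x): discount more as on-hand stock grows."""
    return max(floor, base_price - slope * max(inventory, 0.0))


def simulate_sSp(s, S, periods, fixed_cost=20.0, unit_cost=3.0,
                 holding=0.5, backlog=2.0, seed=0):
    """Simulate an (s,S,p) policy; return (average profit per period, orders)."""
    rng = random.Random(seed)
    inventory = 0.0          # negative values represent backlogged demand
    profit = 0.0
    orders = 0
    for _ in range(periods):
        # Ordering rule: at or below the reorder point s, order up to S
        # and pay the fixed ordering cost plus the per-unit cost.
        if inventory <= s:
            profit -= fixed_cost + unit_cost * (S - inventory)
            inventory = S
            orders += 1
        # Pricing rule: the price depends on the inventory level.
        p = price_for(inventory)
        # Assumed demand model: linear in price with additive Gaussian noise.
        demand = max(0.0, 50.0 - 3.0 * p + rng.gauss(0.0, 2.0))
        profit += p * demand
        inventory -= demand  # unsatisfied demand is backlogged
        profit -= holding * max(inventory, 0.0) + backlog * max(-inventory, 0.0)
    return profit / periods, orders


average_profit, orders_placed = simulate_sSp(s=10.0, S=60.0, periods=100)
```

In the setting of the abstract, the demand curve and noise distribution would be unknown and would have to be estimated from data; the sketch fixes them only so that the policy structure can be run end to end.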

Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands

Author: Boxiao Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
We study the fundamental model in joint pricing and inventory replenishment control under the learning-while-doing framework, with T consecutive review periods and the firm not knowing the demand curve a priori. At the beginning of each period, the retailer makes both a price decision and an inventory order-up-to level decision, and collects revenues from consumers' realized demands while incurring costs from either holding unsold inventory or losing sales from unsatisfied customer demand. We make the following contributions to this fundamental problem:
1. We propose a novel inversion method based on empirical measures to consistently estimate the difference of the instantaneous reward functions at two prices, directly tackling the fundamental challenge brought by censored demands, without raising the order-up-to levels to unnaturally high levels to collect more demand information. Based on this technical innovation, we design bisection and trisection search methods that attain an O(T^{1/2}) regret, assuming the reward function is concave and only twice continuously differentiable.
2. In the more general case of non-concave reward functions, we design an active tournament elimination method that attains O(T^{3/5}) regret, based also on the technical innovation of consistent estimates of reward differences at two prices.
3. We complement the O(T^{3/5}) regret upper bound with a matching Omega(T^{3/5}) regret lower bound. The lower bound is established by a novel information-theoretical argument based on generalized squared Hellinger distance, which is significantly different from conventional arguments that are based on Kullback-Leibler divergence. This lower bound shows that no learning-while-doing algorithm could achieve O(T^{1/2}) regret without assuming the reward function is concave, even if the sales revenue as a function of demand rate or price is concave.
Both the upper bound technique based on the "difference estimator" and the lower bound technique based on generalized Hellinger distance are new in the literature, and can potentially be applied to other inventory or censored-demand problems that involve learning.
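
The trisection idea mentioned above can be sketched as follows. This is a simplified illustration that assumes exact, noise-free reward evaluations; the method in the abstract instead compares estimated reward differences built from censored demand data. The function name trisection_max and the example reward curve are hypothetical.

```python
def trisection_max(reward, lo, hi, tol=1e-6):
    """Locate the maximizer of a concave function on [lo, hi] by trisection."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        # For a concave reward, the sign of reward(m2) - reward(m1) tells us
        # which outer third cannot contain the maximizer.
        if reward(m1) < reward(m2):
            lo = m1   # maximizer lies in [m1, hi]
        else:
            hi = m2   # maximizer lies in [lo, m2]
    return (lo + hi) / 2.0


# Hypothetical concave revenue curve p * (20 - 2p), maximized at p = 5.
best_price = trisection_max(lambda p: p * (20.0 - 2.0 * p), 0.0, 10.0)
```

Only the comparison between the two interior prices matters at each step, which is why a consistent estimator of reward differences at two prices is enough to drive this kind of search.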

Dynamic Pricing With Infrequent Inventory Replenishments

Author: Boxiao Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
We consider a joint pricing and inventory control problem where prices can be adjusted more frequently, such as every period, than inventory ordering decisions, which are made once per epoch, where an epoch consists of multiple periods. This is motivated by many examples, especially for online retailers, where price is much easier to change than inventory level, because changing the latter is subject to logistical and capacity constraints. In this setting, the retailer determines the inventory level at the beginning of each epoch and solves a dynamic pricing problem within each epoch with no further replenishment opportunities. The optimal pricing and inventory control policy is characterized by an intricate dynamic programming (DP) solution. We consider the situation where the demand-price function and the distribution of random demand noise are both unknown to the retailer, and the retailer needs to develop an online learning algorithm to learn this information while maximizing total profit. We propose a learning algorithm based on least squares estimation and construction of an empirical noise distribution under a UCB framework, and prove that the algorithm converges through the DP recursions to the optimal pricing and inventory control policy under complete demand information. The theoretical lower bound on the convergence rate of any learning algorithm is proved using the multivariate Van Trees inequality coupled with some structural DP analyses, and we show that the upper bound on our algorithm's convergence rate matches this lower bound.
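
The two estimation ingredients named above, least squares and an empirical noise distribution, can be illustrated with a small sketch. It assumes a linear demand model d = a + b*p + noise, which the abstract does not specify, and it leaves out the UCB and DP components entirely; the function names and data are hypothetical.

```python
def fit_linear_demand(prices, demands):
    """Ordinary least squares for the assumed model d = a + b * p + noise."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_d = sum(demands) / n
    sxx = sum((p - mean_p) ** 2 for p in prices)
    sxy = sum((p - mean_p) * (d - mean_d) for p, d in zip(prices, demands))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_p
    # The residuals act as samples from the unknown noise distribution.
    residuals = [d - (intercept + slope * p) for p, d in zip(prices, demands)]
    return intercept, slope, residuals


def empirical_cdf(samples, x):
    """Fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


# Hypothetical observations generated from d = 10 - 2p plus small noise.
prices = [1.0, 2.0, 3.0, 4.0]
demands = [8.5, 5.5, 3.5, 2.5]
a_hat, b_hat, noise_samples = fit_linear_demand(prices, demands)
prob_noise_nonpositive = empirical_cdf(noise_samples, 0.0)
```

The fitted curve and the residual-based distribution are the inputs a planner would plug into the pricing and ordering problem in place of the unknown true quantities.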

Combined Dynamic Pricing and Inventory Control

Author:
Publisher:
ISBN: 9780599594609
Category :
Languages : en
Pages :

Book Description


Dynamic Pricing and Inventory Control

Author: Elodie Adida
Publisher: VDM Publishing
ISBN: 9783836421430
Category : Business & Economics
Languages : en
Pages : 288

Book Description
We introduce and study a solution method that makes it possible to compute the optimal solution over a finite time horizon in a monopoly setting. Our results illustrate the role of capacity and the effects of the dynamic nature of demand. We then introduce an additive model of demand uncertainty. We use a robust optimization approach to protect the solution against data uncertainty in a tractable manner, without imposing stringent assumptions on the available information. We show that the robust formulation is of the same order of complexity as the deterministic problem and demonstrate how to adapt the solution method. Next, we consider a duopoly setting and use a more general model of additive and multiplicative demand uncertainty. We formulate the robust problem as a coupled-constraint differential game. Using a quasi-variational inequality reformulation, we prove the existence of Nash equilibria in continuous time and study issues of uniqueness. Finally, we introduce a relaxation-type algorithm and prove its convergence to a particular Nash equilibrium (the normalized Nash equilibrium) in discrete time.

Data Based Dynamic Pricing and Inventory Control with Censored Demand and Limited Price Changes

Author: Boxiao Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 61

Book Description
A firm makes pricing and inventory replenishment decisions for a product over T periods to maximize its expected total profit. Demand is random and price sensitive, and unsatisfied demand is lost and unobservable (censored demand). The firm knows the demand process up to some parameters and needs to learn them through pricing and inventory experimentation. However, business constraints prevent the firm from making frequent price changes, leading to correlated and dependent sales data. We develop data-driven algorithms that actively experiment with inventory and pricing decisions, and construct a maximum likelihood estimator from censored and correlated samples for parameter estimation. We analyze the algorithms using the T-period regret, defined as the profit loss of the algorithms over T periods compared with the clairvoyant optimal policy that knows the parameters a priori. For a so-called well-separated case, we show that the regret of our algorithm is O(T^{1/(m+1)}) when the number of price changes is limited by m >= 1, and is O(log T) when it is limited by beta log T for some constant beta > 0; for a more general case, the regret is O(T^{1/2}) when the underlying demand is bounded and O(T^{1/2} log T) when the underlying demand is unbounded. We further prove that our algorithm for each case is the best possible, in the sense that its regret rate matches the theoretical lower bound.
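
As a small illustration of maximum likelihood estimation under censored demand, the sketch below assumes exponentially distributed demand and independent periods. Both are simplifying assumptions made here: the abstract does not name a demand family, and its estimator handles correlated samples. The function name and data are hypothetical.

```python
def censored_exponential_mle(sales, stock_levels):
    """MLE of the rate of exponential demand when sales = min(demand, stock).

    A period is censored when sales equal the stock level (a stockout), so
    the true demand is only known to be at least that large. For the
    exponential distribution the MLE is the number of uncensored
    observations divided by the total observed sales.
    """
    uncensored = sum(1 for y, q in zip(sales, stock_levels) if y < q)
    total_sales = sum(sales)
    if uncensored == 0 or total_sales <= 0:
        raise ValueError("need at least one uncensored period with positive sales")
    return uncensored / total_sales


# Hypothetical data: stock of 5 each period; periods 2 and 4 sold out.
rate_hat = censored_exponential_mle([2.0, 5.0, 1.0, 5.0], [5.0, 5.0, 5.0, 5.0])
mean_demand_hat = 1.0 / rate_hat
```

Ignoring the censoring and averaging sales directly would underestimate mean demand, which is why stockout periods have to enter the likelihood differently from periods with leftover stock.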

Operationalizing Dynamic Pricing Models

Author: Steffen Christ
Publisher: Springer Science & Business Media
ISBN: 3834961841
Category : Business & Economics
Languages : en
Pages : 363

Book Description
Steffen Christ shows how theoretic optimization models can be operationalized by employing self-learning strategies to construct relevant input variables, such as latent demand and customer price sensitivity.

Dynamic Pricing and Inventory Control for Multiple Products

Author: Dimitris Bertsimas
Publisher:
ISBN:
Category :
Languages : en
Pages : 20

Book Description
A periodical multi-product pricing and inventory control problem with applications to production planning and airline revenue management is studied. The objective function of the single-period model is shown to be convex for certain types of demand distributions, thus tractable for large instances. A heuristic is proposed to solve the more complex multi-period problem, which is an interesting combination of linear and dynamic programming. Numerical experiments and theoretical bounds on the optimal expected revenue suggest that the extent to which a dynamic policy based on a stochastic model will outperform a simple static policy based on a deterministic model depends on the level of demand variability as measured by the coefficient of variation.