Estimation in Mixed Frequency Distributions
Author: A. Clifford Cohen
Publisher:
ISBN:
Category : Distribution (Probability theory)
Languages : en
Pages : 100
Book Description
Estimating Frequency Distributions in Data Streams
Author: Justin Y. Chen
Publisher:
ISBN:
Category :
Languages : en
Pages : 0
Book Description
Streaming algorithms allow for space-efficient processing of massive datasets. The distribution of the frequencies of items in a large dataset is often used to characterize that data: e.g., the data is heavy-tailed, the data follows a power law, or there are many elements that appear only once or twice. In this thesis, we focus on the problem of estimating the profile (a vector representation of the frequency distribution). Given a sequence of m elements from a universe of size n, its profile is a vector φ whose i-th entry φ_i represents the number of distinct elements that appear in the stream exactly i times. A classic paper by Datar and Muthukrishnan from 2002 gave an algorithm which estimates any entry φ_i up to an additive error of ±εD using O(1/ε² log(nm)) bits of space, where D is the number of distinct elements in the stream. We considerably improve on this result by designing an algorithm which estimates the whole profile vector φ, up to overall error ±εm, using O(1/ε² log(1/ε) + log(nm)) bits. More formally, we give an algorithm that computes an approximate profile φ̂ such that the L1 distance ‖φ − φ̂‖₁ is at most εm. In addition to bounding the error across all coordinates, our space bound separates the terms that depend on 1/ε from those that depend on n and m. Furthermore, we give a lower bound showing that our bound is optimal up to constant factors. To achieve these results, we introduce two new techniques. First, we develop hashing-based sketches that keep very limited information about the identities of the hashed elements. As a result, elements with different frequencies are mixed together, and need to be unmixed using an iterative "deconvolution" process.
Second, we reduce the randomness used by the algorithms in a somewhat subtle way: we first use Nisan's generator to ensure that the random variables of interest are O(1)-wise independent, and then we analyze those variables by calculating their moments. (In our setting, using Nisan's generator alone would not yield the desired space bound.) The latter technique seems quite versatile, and has already been used for other streaming problems [Ano23].
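To make the definition of the profile concrete, the following is a minimal sketch that computes the exact profile of a small in-memory stream and the L1 distance between two profiles. It is not the space-efficient sketching algorithm the thesis describes: it stores every element, and the function names `profile` and `l1_distance` are illustrative assumptions.

```python
from collections import Counter


def profile(stream):
    """Exact profile: maps i to the number of distinct elements seen exactly i times.

    Illustrative only; this keeps a counter per element, so it uses far more
    space than a streaming sketch would.
    """
    freq = Counter(stream)          # element -> number of occurrences
    return Counter(freq.values())   # occurrence count i -> phi_i


def l1_distance(phi, phi_hat):
    """L1 distance between two profiles given as {i: count} mappings."""
    keys = set(phi) | set(phi_hat)
    return sum(abs(phi.get(i, 0) - phi_hat.get(i, 0)) for i in keys)


# Hypothetical stream of m = 7 elements: "a" x3, "b" x2, "c" x1, "d" x1.
stream = ["a", "b", "a", "c", "b", "a", "d"]
phi = profile(stream)
```

Here `phi` has two elements at frequency 1 and one element each at frequencies 2 and 3, and the sum of i times φ_i over all i recovers the stream length m.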
Regional Frequency Analysis
Author: J. R. M. Hosking
Publisher: Cambridge University Press
ISBN: 0521430453
Category : Mathematics
Languages : en
Pages : 240
Book Description
This book is the first complete account of the L-moment approach to regional frequency analysis of environmental extremes.
Government-wide Index to Federal Research & Development Reports
Author:
Publisher:
ISBN:
Category : Government publications
Languages : en
Pages : 1352
Book Description
The Estimation and Tracking of Frequency
Author: B. G. Quinn
Publisher: Cambridge University Press
ISBN: 9780521804462
Category : Computers
Languages : en
Pages : 282
Book Description
This book presents practical techniques for estimating frequencies of signals. Includes Matlab code. For researchers.
Estimation with Mixed Data Frequencies
Author: Anisha Ghosh
Publisher:
ISBN:
Category :
Languages : en
Pages :
Book Description
We propose a solution to the measurement error problem that plagues the estimation of the relation between the expected return of the stock market and its conditional variance, a problem that arises because these conditional moments are latent. In the first step, we use intra-period returns to construct a nonparametric proxy for the latent conditional variance; in the second step, this proxy is used as an input to estimate the parameters characterizing the risk-return tradeoff via a GMM approach. We propose a bias-correction to the standard GMM estimator, derived under a double asymptotic framework wherein the number of intra-period returns, N, and the number of low-frequency time periods, T, simultaneously go to infinity. Simulation exercises show that the bias-correction is particularly relevant for small values of N, which is the case in empirically realistic scenarios. The methodology lends itself to additional applications, such as the empirical evaluation of factor models, wherein the factor betas may be estimated using intra-period returns and the unexplained returns, or alphas, subsequently recovered at lower frequencies.
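The two-step idea can be illustrated with a rough sketch. It assumes the first-step proxy is realized variance (the sum of squared intra-period returns), which the description does not name explicitly, and it swaps the paper's bias-corrected GMM estimator for a plain least-squares slope. The function names are hypothetical.

```python
def realized_variance(intra_returns):
    """Step 1 (assumed proxy): sum of squared intra-period returns."""
    return sum(r * r for r in intra_returns)


def ols_slope(x, y):
    """Step 2 stand-in: least-squares slope of y on x.

    The paper uses a bias-corrected GMM estimator; ordinary least squares
    is used here only to show where the variance proxy enters.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical data: T = 3 low-frequency periods, N = 3 intra-period returns each.
intra = [[0.01, -0.02, 0.01], [0.03, -0.01, 0.02], [0.00, 0.01, -0.01]]
period_returns = [0.002, 0.004, 0.001]

proxies = [realized_variance(r) for r in intra]
slope = ols_slope(proxies, period_returns)
```

With N this small the proxy is a noisy measurement of the conditional variance, which is the errors-in-variables situation the bias-correction in the description is meant to address.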
U-MIDAS
Author: Claudia Foroni
Publisher:
ISBN: 9783865587817
Category :
Languages : en
Pages : 0
Book Description
Finite Mixture Distributions
Author: B. Everitt
Publisher: Springer Science & Business Media
ISBN: 9400958978
Category : Science
Languages : en
Pages : 148
Book Description
Finite mixture distributions arise in a variety of applications ranging from the length distribution of fish to the content of DNA in the nuclei of liver cells. The literature surrounding them is large and goes back to the end of the last century when Karl Pearson published his well-known paper on estimating the five parameters in a mixture of two normal distributions. In this text we attempt to review this literature and in addition indicate the practical details of fitting such distributions to sample data. Our hope is that the monograph will be useful to statisticians interested in mixture distributions and to research workers in other areas applying such distributions to their data. We would like to express our gratitude to Mrs Bertha Lakey for typing the manuscript.
B. S. Everitt and D. J. Hand, Institute of Psychiatry, University of London, 1980.
CHAPTER I General introduction. 1.1 Introduction. This monograph is concerned with statistical distributions which can be expressed as superpositions of (usually simpler) component distributions. Such superpositions are termed mixture distributions or compound distributions. For example, the distribution of height in a population of children might be expressed as follows: h(height) = ∫ g(height: age) f(age) d age (1.1), where g(height: age) is the conditional distribution of height on age, and f(age) is the age distribution of the children in the population.
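A finite mixture replaces the integral over age with a weighted sum over a few components. The following is a minimal sketch of a mixture of normal densities; the weights, means and standard deviations are made-up illustrative values, not figures from the book.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Finite mixture density: sum of w_k * normal_pdf(x; mean_k, sd_k).

    The weights are assumed to be non-negative and to sum to 1.
    """
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


# Hypothetical example: heights (cm) of children from two age groups,
# 40% younger (mean 110, sd 5) and 60% older (mean 130, sd 6).
weights = [0.4, 0.6]
means = [110.0, 130.0]
sds = [5.0, 6.0]
density_at_120 = mixture_pdf(120.0, weights, means, sds)
```

Counting two means, two standard deviations and one free mixing weight gives the five parameters mentioned for a mixture of two normal distributions.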
NBS Special Publication
Author:
Publisher:
ISBN:
Category : Weights and measures
Languages : en
Pages : 574
Book Description
An Author and Permuted Title Index to Selected Statistical Journals
Author: Brian L. Joiner
Publisher:
ISBN:
Category : Annals of mathematical statistics
Languages : en
Pages : 512
Book Description
All articles, notes, queries, corrigenda, and obituaries appearing in the following journals during the indicated years are indexed: Annals of mathematical statistics, 1961-1969; Biometrics, 1965-1969#3; Biometrics, 1951-1969; Journal of the American Statistical Association, 1956-1969; Journal of the Royal Statistical Society, Series B, 1954-1969,#2; South African statistical journal, 1967-1969,#2; Technometrics, 1959-1969.--p.iv.