Discipline: Mathematics and Statistics
Subcategory: Mathematics and Statistics
Session: 2
Room: Exhibit Hall A
Lauren Chandler-Holtz - Northwestern University
Co-Author(s): Amir Samiepour, Southwestern University, Georgetown, TX; John Ford, Middle Tennessee State University, Murfreesboro, TN; Dr. Rachel Leander, Middle Tennessee State University, Murfreesboro, TN
The need to analyze first event time data, such as measurements of the time it takes a cell to pass a cell cycle checkpoint, arises in many research problems. First event times are often more readily observable than the underlying process that leads up to them. Fortunately, the distribution of first event times carries information about the underlying mechanism. We often wish to estimate the parameters of a stochastic process from the distribution of first event times using maximum likelihood estimation, which requires repeated evaluations of the first event time density. The maximum likelihood estimator has many desirable properties: under standard regularity conditions it is consistent, asymptotically unbiased, and asymptotically efficient. However, the density of first event times may be unknown or computationally expensive to evaluate. To address this, recent work has investigated approximate maximum likelihood estimation as an alternative to maximum likelihood estimation. In approximate maximum likelihood estimation, the underlying stochastic process is simulated to obtain an approximation of the probability density function. In this work, we compare and evaluate three methods of approximate maximum likelihood estimation: a histogram over the data range with fixed bin width, uniform kernels centered on the data with fixed width, and uniform kernels centered on the data with variable width. We vary the simulation parameters, such as the number of simulations run, to observe how they affect the approximation. We then analyze the results to determine the efficacy of each method and to determine whether each approximate maximum likelihood estimate of the parameter converges to the true maximum likelihood estimate as the number of simulations increases. To compare the methods, we use synthetic data from a Poisson process.
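The fixed-width uniform kernel method described above can be sketched in a few lines. The following is a minimal Python illustration, not the authors' MATLAB implementation. All function names, the grid search over candidate rates, the half-width of 0.05, the 20,000 simulations per rate, and the reuse of one random seed across rates are assumptions made for this example. Evenly spaced exponential quantiles stand in for the synthetic Poisson-process data so that the example is deterministic on the data side.

```python
import bisect
import math
import random


def simulate_first_event_times(rate, n_sims, rng):
    # The first event time of a homogeneous Poisson process with this rate
    # is exponentially distributed with mean 1 / rate.
    return sorted(rng.expovariate(rate) for _ in range(n_sims))


def kernel_density(x, sorted_sims, half_width):
    # Uniform kernel centered on x: the fraction of simulated times within
    # half_width of x, divided by the window length.
    lo = bisect.bisect_left(sorted_sims, x - half_width)
    hi = bisect.bisect_right(sorted_sims, x + half_width)
    return (hi - lo) / (len(sorted_sims) * 2.0 * half_width)


def approx_log_likelihood(rate, data, n_sims, half_width, seed=0):
    # Assumption for this sketch: the same seed is reused for every rate
    # (common random numbers), which keeps the objective smooth in the rate.
    sims = simulate_first_event_times(rate, n_sims, random.Random(seed))
    total = 0.0
    for x in data:
        density = kernel_density(x, sims, half_width)
        if density == 0.0:
            return float("-inf")  # no simulated time landed near x
        total += math.log(density)
    return total


# Stand-in synthetic data: midpoint quantiles of an exponential distribution
# with a hypothetical true rate of 2.0.
true_rate = 2.0
n_data = 50
data = [-math.log(1.0 - (i + 0.5) / n_data) / true_rate for i in range(n_data)]

# For exponential data the exact maximum likelihood estimate is 1 / mean.
exact_mle = n_data / sum(data)

# Approximate estimate: maximize the simulated likelihood over a coarse grid.
candidate_rates = [1.0 + 0.1 * i for i in range(21)]
approx_mle = max(
    candidate_rates,
    key=lambda r: approx_log_likelihood(r, data, n_sims=20000, half_width=0.05),
)
print(f"exact MLE: {exact_mle:.3f}, approximate MLE: {approx_mle:.1f}")
```

Because the exact estimate is available in closed form for this model, the two printed values can be compared directly while the number of simulations and the kernel width are varied.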
Slow rates of error convergence under two different error models suggested that the bias persists even for very large numbers of simulations. As the number of simulations increased, the bin width for the histogram method and the data-centered, fixed bin width method converged to zero. The bias of the estimate derived from the histogram method appeared to converge to zero, while the bias of the data-centered, fixed bin width estimate appeared to persist as the bin width approached zero. For the histogram-based approach, the error appeared biased in some cases and unbiased in others. Persistent bias in the data-centered, fixed bin width approach also arises from the difference between the support of this approach and that of the Poisson model. Our approach with data-centered, varying bin widths reduces this bias.
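One way a support mismatch can bias a data-centered kernel is sketched below, as an illustration rather than the authors' analysis. First event times are nonnegative, so a fixed-width window centered on a data point close to zero extends over negative times, where the model places no probability. The example evaluates the large-simulation limit of the kernel estimate in closed form for an exponential first event time. The rule `min(half_width, x)` for shrinking the window is a hypothetical choice of variable width, and the rate, data point, and widths are assumed values.

```python
import math


def exp_cdf(t, rate):
    # Cumulative distribution of an exponential first event time.
    return 1.0 - math.exp(-rate * t) if t > 0.0 else 0.0


def expected_kernel_density(x, half_width, rate):
    # Limit of the uniform-kernel estimate at x as the number of
    # simulations grows: window probability divided by window length.
    mass = exp_cdf(x + half_width, rate) - exp_cdf(x - half_width, rate)
    return mass / (2.0 * half_width)


rate = 2.0          # assumed Poisson rate
x = 0.01            # a data point close to the edge of the support
half_width = 0.05   # assumed fixed kernel half-width

true_density = rate * math.exp(-rate * x)

# Fixed width: the window [-0.04, 0.06] overlaps negative times.
fixed = expected_kernel_density(x, half_width, rate)

# Hypothetical variable width: shrink the window so it stays inside [0, inf).
shrunk = expected_kernel_density(x, min(half_width, x), rate)

print(f"true density:   {true_density:.4f}")
print(f"fixed width:    {fixed:.4f}")
print(f"variable width: {shrunk:.4f}")
```

In this setup the fixed-width value falls well below the true density, while the shrunken window nearly recovers it.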
Funder Acknowledgement(s): We thank the National Science Foundation for providing funding and the opportunity for research and learning through REU Award #1757493.
Faculty Advisor: Dr. Rachel Leander, Rachel.leander@mtsu.edu
Role: In collaboration with partner: created simulations, approximations, and data analysis measurements/charts using MATLAB; also created and presented the poster and conducted a literature review. In collaboration with partner and mentors: data/result analysis and future directions planning.