QUESTIONS - Intro
Analytics Modeling
Questions and Answers
(Latest Update 2025-2026)
Approximate dynamic program - ANSWERS-Dynamic programming model where the
value functions are approximated.
Dynamic programming - ANSWERS-Optimization approach that involves making a
sequence of decisions over time, based on the current state of a system.
Earth - ANSWERS-Name of many implementations of the multivariate adaptive regression
splines (MARS) model, because "MARS" is a trademark.
Edge - ANSWERS-Connection between two nodes/vertices in a network. In a network
model, there is a variable for each edge, equal to the amount of flow on the edge, and
(optionally) a capacity constraint on the edge's flow. Also called an arc.
Eigenvalue - ANSWERS-Amount by which an eigenvector gets rescaled in a linear
transformation.
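The rescaling idea can be checked numerically. This is a minimal sketch; the matrix A and vector v are hypothetical values chosen for illustration, and mat_vec is a helper defined here, not a library function.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# Hypothetical example: v = [1, 1] is an eigenvector of A.
A = [[2.0, 1.0],
     [1.0, 2.0]]
v = [1.0, 1.0]

Av = mat_vec(A, v)       # the transformed vector
scale = Av[0] / v[0]     # factor by which v was rescaled (the eigenvalue)

# Every component of v is rescaled by the same factor.
assert all(abs(av - scale * x) < 1e-12 for av, x in zip(Av, v))
print(Av, scale)
```

Because A only stretches v without changing its direction, the common ratio between Av and v is the eigenvalue.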
Arc - ANSWERS-Connection between two nodes/vertices in a network. In a network
model, there is a variable for each arc, equal to the amount of flow on the arc, and
(optionally) a capacity constraint on the arc's flow. Also called an edge.
Area under the curve (AUC) - ANSWERS-Area under the ROC curve; an estimate of
the classification model's quality (how reliably it scores positive cases above negative
ones). Also called concordance index.
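The "concordance index" name suggests one way to compute AUC: the fraction of (positive, negative) pairs in which the positive case gets the higher score. The sketch below assumes binary labels coded 0/1 and counts ties as half; the function name and the sample data are hypothetical.

```python
def auc_by_concordance(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # positive scored above negative
            elif p == n:
                concordant += 0.5   # tie counts as half (an assumption)
    return concordant / (len(positives) * len(negatives))

# Hypothetical labels and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = auc_by_concordance(labels, scores)
print(auc)
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the value is 8/9.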
ARIMA - ANSWERS-Autoregressive integrated moving average.
Arrival Rate - ANSWERS-Expected number of arrivals of people, things, etc. per unit
time -- for example, the expected number of truck deliveries per hour to a warehouse.
Assignment Problem - ANSWERS-Network optimization model with two sets of nodes,
that finds the best way to assign each node in one set to a node in the other set.
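For a very small instance, the assignment problem can be illustrated by trying every one-to-one pairing. This is a brute-force sketch for intuition only, not how the problem is solved at scale; the cost matrix and function name are hypothetical.

```python
from itertools import permutations

def best_assignment(cost):
    """Return (assignment, total_cost), where assignment[i] is the column
    (e.g. job) given to row i (e.g. worker). Tries every permutation."""
    n = len(cost)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)

# Hypothetical costs: cost[i][j] = cost of assigning worker i to job j.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total_cost = best_assignment(cost)
print(assignment, total_cost)
```

The number of pairings grows factorially, which is why real models use network or linear-programming formulations instead.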
Attribute - ANSWERS-A characteristic or measurement - for example, a person's height
or the color of a car. Generally interchangeable with "feature", and often with "covariate"
or "predictor". In the standard tabular format, a column of data.
Autoregression - ANSWERS-Regression technique using past values of time series
data as predictors of future values.
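As a minimal sketch of the idea, a first-order autoregression predicts each value from the one before it using ordinary least squares. The function name and the series are hypothetical; the series was constructed so that each value is exactly 0.5 times the previous value plus 1.

```python
def fit_ar1(series):
    """Fit y_t = intercept + slope * y_(t-1) by ordinary least squares."""
    x = series[:-1]   # past values (predictor)
    y = series[1:]    # next values (response)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical series generated by y_t = 0.5 * y_(t-1) + 1.
series = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]
intercept, slope = fit_ar1(series)

# One-step-ahead forecast from the last observed value.
forecast = intercept + slope * series[-1]
print(intercept, slope, forecast)
```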
Autoregressive integrated moving average (ARIMA) - ANSWERS-Time series model
that uses differences between observations when data is nonstationary. Also called
Box-Jenkins.
Backward elimination - ANSWERS-Variable selection process that starts with all
variables and then iteratively removes the least-immediately-relevant variables from the
model.
Balanced Design - ANSWERS-Set of combinations of factor values across multiple
factors, that has the same number of runs for all combinations of levels of one or more
factors.
Balking - ANSWERS-An entity arrives to the queue, sees the size of the line (or some
other attribute), and decides to leave the system.
Bayes' theorem/Bayes' rule - ANSWERS-Fundamental rule of conditional probability:
𝑃(𝐴|𝐵)=𝑃(𝐵|𝐴)*𝑃(𝐴) / 𝑃(𝐵)
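A short worked example of the rule, with P(B) expanded over the two cases "A" and "not A". The probabilities are hypothetical numbers for a screening-test scenario, and the function name is an illustration only.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    # Total probability of B across both cases.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b

# Hypothetical inputs: 1% of people have a condition (A); the test is
# positive (B) for 95% of those with it and 5% of those without it.
posterior = bayes_posterior(prior_a=0.01,
                            p_b_given_a=0.95,
                            p_b_given_not_a=0.05)
print(round(posterior, 4))
```

With these inputs the posterior is about 0.16: because the condition is rare, most positive results still come from people without it.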
Bayesian Information criterion (BIC) - ANSWERS-Model selection technique that trades
off model fit and model complexity. When comparing models, the model with lower BIC
is preferred. Generally penalizes complexity more than AIC.
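The trade-off can be made concrete with the standard formulas BIC = k*ln(n) - 2*ln(L) and AIC = 2k - 2*ln(L), where k is the number of parameters, n the number of data points, and L the maximized likelihood. The two models and their log-likelihoods below are hypothetical.

```python
import math

def bic(num_params, num_points, log_likelihood):
    """Bayesian information criterion: k*ln(n) - 2*ln(L)."""
    return num_params * math.log(num_points) - 2.0 * log_likelihood

def aic(num_params, log_likelihood):
    """Akaike information criterion: 2k - 2*ln(L)."""
    return 2.0 * num_params - 2.0 * log_likelihood

n = 100  # hypothetical number of data points

# Hypothetical models: B fits slightly better but uses twice the parameters.
bic_a = bic(num_params=3, num_points=n, log_likelihood=-120.0)
bic_b = bic(num_params=6, num_points=n, log_likelihood=-118.0)

print(bic_a, bic_b)              # lower BIC is preferred
print(math.log(n), "vs", 2.0)    # BIC vs AIC penalty per parameter
```

Whenever ln(n) exceeds 2 (that is, n of 8 or more), each extra parameter costs more under BIC than under AIC.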
Bayesian Regression - ANSWERS-Regression model that incorporates estimates of
how coefficients and error are distributed.
Bellman's Equation - ANSWERS-Equation used in dynamic programming that ensures
optimality of a solution.
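One common deterministic form of the equation is V(s) = min over actions of [immediate cost + V(next state)], with V = 0 at the terminal state. The sketch below applies that recursion to a small shortest-path problem; the graph, its costs, and the function name are hypothetical, and the form shown is one special case, not the general stochastic version.

```python
def cost_to_go(node, graph, terminal, memo=None):
    """Minimum total cost from node to terminal via the Bellman recursion."""
    if memo is None:
        memo = {}
    if node == terminal:
        return 0
    if node not in memo:
        # Best action = the successor minimizing step cost + cost-to-go.
        memo[node] = min(
            step_cost + cost_to_go(nxt, graph, terminal, memo)
            for nxt, step_cost in graph[node].items()
        )
    return memo[node]

# Hypothetical acyclic network: graph[s][t] = cost of moving from s to t.
graph = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 6},
    "C": {"D": 2},
    "D": {},
}
best = cost_to_go("A", graph, terminal="D")
print(best)
```

Each state's value depends only on the values of the states it can reach, so solving the smaller subproblems optimally yields an optimal overall path (here A to B to C to D).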
Bernoulli Distribution - ANSWERS-Discrete probability distribution where the outcome is
binary, either 0 or 1. Often, 1 represents success and 0 represents failure. The
probability of the outcome being 1 is 𝑝 and the probability of outcome being 0 is 𝑞 =
1−𝑝, where 𝑝 is between 0 and 1.
Bias - ANSWERS-Systematic difference between a true parameter of a population and
its estimate.
Binary Data - ANSWERS-Data that can take only two different values (true/false, 0/1,
black/white, on/off, etc.)
Binary integer program - ANSWERS-Integer program where all variables are binary
variables.
Binary Variable - ANSWERS-Variable that can take just two values: 0 and 1.
Binomial Distribution - ANSWERS-Discrete probability distribution for the exact number
of successes, k, out of a total of n iid Bernoulli trials, each with probability p:
Pr(k) = (n choose k) * p^k * (1-p)^(n-k)
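The formula translates directly into code. This is a minimal sketch using math.comb from the Python standard library (Python 3.8+) for the "n choose k" term; the function name and inputs are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n iid Bernoulli(p) trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

# Example: exactly 2 heads in 4 fair coin flips.
prob = binomial_pmf(k=2, n=4, p=0.5)
print(prob)

# The probabilities over all possible k sum to 1 (up to rounding).
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(total)
```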
Blocking - ANSWERS-Factor introduced to an experimental design that interacts with
the effect of the factors to be studied. The effect of the factors is studied within the same
level (block) of the blocking factor.
Box and whisker plot - ANSWERS-Graphical representation of data showing the middle
range of data (the "box"), reasonable ranges of variability ("whiskers"), and points
(possible outliers) outside those ranges.
Box-Cox Transformation - ANSWERS-Transformation of a non-normally-distributed
response so that it more closely follows a normal distribution.
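For reference, the standard one-parameter form of the transformation is (y^lambda - 1) / lambda for lambda != 0 and ln(y) for lambda = 0, defined for positive y. The sketch below only applies the transformation for a given lambda; choosing lambda (usually by maximum likelihood) is not shown, and the function name is hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires a positive response value")
    if lam == 0:
        return math.log(y)            # limiting case as lam approaches 0
    return (y ** lam - 1.0) / lam

# Hypothetical right-skewed responses, transformed with lam = 0.5.
responses = [1.0, 4.0, 9.0, 100.0]
transformed = [box_cox(y, 0.5) for y in responses]
print(transformed)
```

With lam = 0.5 the transformation is a shifted, scaled square root, which compresses large values more than small ones.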
Branching - ANSWERS-Splitting a set of data into two or more subsets, to each be
analyzed separately.
CART - ANSWERS-Classification and regression trees.
Categorical Data - ANSWERS-Data that classifies observations without quantitative
meaning (for example, colors of cars) or where quantitative amounts are categorized
(for example, "0-10, 11-20, ...").
Causation - ANSWERS-Relationship in which one thing makes another happen (i.e.,
one thing causes another).