List of All Flashcard Terms

Link to the Interactive Flashcards

The data science terms covered by these flashcards are listed below, grouped by topic:

Basic Data Science Terms

  • data
  • data science
  • data point
  • variables
  • features
  • quantitative data
  • qualitative data
  • research question

Basics of Random Experiments

  • scatter plot
  • histogram
  • relative frequency
  • random experiment
  • outcome
  • set
  • event
  • event class
  • fair experiment
  • probability

Basics of Hypothesis Testing and Summary Statistics

  • disjoint
  • set
  • partition
    (data set)
  • statistical hypothesis
  • binary hypothesis test
  • null hypothesis
    (for multiple groups)
  • model-based methods
  • model-free methods
  • resampling
  • plot legend
  • outlier
  • mode
  • median
  • average (or sample mean) of a data set

Probability Spaces and Combinatorics

  • event class
  • power set
  • probability measure
  • composite experiment
  • trial (compound experiments)
  • repeated experiment
  • statistical regularity
  • outcome (of a random experiment)
  • sample space
  • event
  • fair (random experiment)
  • combinatorics
  • Cartesian product
  • permutation

Statistical Studies and Null Hypothesis Significance Testing

  • Type-I error
  • Type-II error
  • power
  • exact permutation test
  • bootstrap distribution
  • confidence interval (CI)
  • statistical study
  • experimental study or experiment
  • observational study
  • randomized controlled trial (RCT)
  • natural experiment
  • population
  • population study
  • sample
  • cross-sectional study
  • longitudinal study
  • longitudinal cohort study
  • prospective study
  • retrospective study
  • post hoc analysis
  • selection bias
  • sampling distribution
  • mode(s) of a distribution
  • unimodal distribution
  • tail probability
  • right, or upper, tail
  • left, or lower, tail
  • two-sided tail

Conditional Probability and Statistical Independence

  • conditional probability
  • statistically independent
    (two events)
  • statistically independent
    (any number of events)
  • pairwise statistically independent
  • statistically independent (s.i.) vs. mutually exclusive (m.e.)
  • conditional independence (events)

Bayes' Rule and Optimal Decisions

  • stochastic system
  • likelihoods
    (discrete stochastic systems)
  • a posteriori probability
    (discrete stochastic system)
  • a priori probability
    (discrete stochastic system)
  • Bayes' Rule
    (discrete stochastic system)
  • base rate fallacy
  • hidden state
  • decision rule
    (discrete stochastic system)
  • maximum likelihood (ML) rule
    (discrete stochastic system)
  • MAP rule
    (discrete stochastic system)
  • uninformative prior
  • informative prior
  • credible interval

Random Variables

  • Borel sets (of $\mathbb{R}$)
  • Borel field or Borel $\sigma$-algebra
  • random variable
  • range (of a random variable)
  • discrete random variable
  • probability mass function (PMF)
  • cumulative distribution function (CDF)
  • staircase function
  • survival function (SF)
  • discrete uniform random variable
  • Bernoulli random variable
  • Binomial random variable
  • Geometric random variable
  • Poisson random variable
  • continuous uniform random variable
  • probability density function (pdf)
  • piecewise function
  • (Continuous) Uniform RV
  • Exponential RV
  • inverse CDF
  • Normal (Gaussian) RV
  • Chi-squared RV
  • Student's $t$ RV

Expected Values and Estimation

  • expected value
    (discrete random variable)
  • expected value
    (continuous random variable)
  • mode (of a random variable)
  • median (of a random variable)
  • $n$th moment
  • Law of the Unconscious Statistician (LOTUS)
  • $n$th central moment
  • variance
    (random variable)
  • variance of a constant
    $\operatorname{Var}[c]$
  • variance when adding a constant
    $\operatorname{Var}[X+c]$
  • variance when multiplying by a constant
    $\operatorname{Var}[cX]$
  • variance of a sum of independent random variables
    $\operatorname{Var}\left[ \sum_{i=0}^{N-1} X_i \right]$
  • vector
  • estimate
  • estimator
  • estimator error
  • estimator bias
  • unbiased estimator
  • standard error of the mean (SEM)
  • sampling distribution
  • effect size

Point Conditioning, Non-Bayesian and Bayesian Decision Rules with Continuous Observations

  • likelihood for discrete-input, continuous-output systems
  • receiver operating characteristic (ROC) curve
  • area under curve (AUC)
  • point conditioning
  • a posteriori probability for discrete-input, continuous-output systems
  • total probability for CDFs
  • total probability for pdfs
  • total probability for events with point conditioning

Categorical Data, Contingency Tables, and Chi-Squared Tests

  • categorical data
  • ordinal data
  • nominal data
  • contingency table
  • degrees of freedom
    (contingency table)
  • one-way table

Covariance, Correlation, and Linear Regression

  • vector
  • component or element (vector)
  • scalar
  • size (of a vector)
  • zero vector
  • ones vector
  • standard unit vector
  • vector addition
  • scalar-vector multiplication
  • component-wise vector multiplication (Hadamard product)
  • dot product / inner product
  • norm squared
  • norm
  • distance (vectors)
  • transpose
  • covariance
    (random variables)
  • covariance
    (data vectors)
  • correlation coefficient
    (random variables)
  • correlation coefficient
    (data vectors)
  • explanatory variable
  • response variable
  • coefficient of determination
    (simple linear regression)
  • total variance
    (simple linear regression)
  • explained variance
    (simple linear regression)

Jointly Distributed Random Variables, KLT, and PCA

  • joint probability mass function
    (pair of random variables)
  • joint cumulative distribution function
    (pair of random variables)
  • joint probability density function
    (pair of random variables)
  • marginal probability density function
    (pair of random variables)
  • contour of equal probability density
    (pair of random variables)
  • random vector
  • mean vector
  • covariance matrix
  • correlation coefficient
    (random variables)
  • uncorrelated
  • correlation matrix
  • iid
  • broadcasting
  • standardization
  • eigenvector
  • eigenvalue
  • characteristic equation
  • modal matrix
  • eigendecomposition
  • relating determinant and eigenvalues
  • dimensionality reduction
  • Karhunen-Loève Transform (KLT)
  • principal components analysis (PCA)
  • scree plot
  • explained variance
  • test-train split
