Data analysis, Statistics

Computational Plant Science – Clustering in R Tutorial

Last week's Plantae Seminar was "Computational Plant Science - Science at the Interface of Math, Computer Science, and Plant Biology with Alexander Bucksch". Dr. Bucksch mentions using two clustering techniques: B-splines and K-means clustering. I've discussed K-means clustering in a previous post that analyzed predictors of success in Settlers of Catan. B-splines…

Computation, Statistics

Introduction to Bayesian Statistics, Part 3

The two most popular Markov chain Monte Carlo sampling algorithms are Gibbs sampling and Metropolis-Hastings. These algorithms produce Markov chains. Each value in a Markov chain depends only on the value before it. In the context of sampling, we check the probability of the proposed value based on only the probability of the current value,…
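To make the "proposed value versus current value" idea concrete, here is a minimal sketch of a random-walk Metropolis sampler, which is the special case of Metropolis-Hastings with a symmetric proposal. The target (a standard normal), the step size, the seed, and the function name `metropolis` are all assumptions chosen for illustration; none of them come from the post itself.

```python
import math
import random


def metropolis(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler (illustrative sketch, not a tuned implementation)."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    chain = []
    for _ in range(n_samples):
        # Propose a new value near the current one (symmetric Gaussian step).
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # Only the current value matters, which is what makes this a Markov chain.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def log_standard_normal(x):
    # Log density of N(0, 1) up to an additive constant (assumed target).
    return -0.5 * x * x


chain = metropolis(log_standard_normal, start=0.0, n_samples=20000)
mean = sum(chain) / len(chain)
variance = sum((x - mean) ** 2 for x in chain) / len(chain)
print(f"sample mean ~ {mean:.2f}, sample variance ~ {variance:.2f}")
```

With enough samples the mean and variance of the chain should land close to 0 and 1, the values for the assumed target.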

Data analysis, Statistics

Introduction to Bayesian statistics, Part 2

As mentioned in Part 1, in Bayesian statistics you summarize a priori knowledge in the prior, and your data in the likelihood. The prior distribution is often chosen based on analytical convenience, while the likelihood is chosen based on the underlying sampling distribution (read about some appropriate distributions here). Multiplying these together produces the posterior distribution. Probability…
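A small sketch of "prior times likelihood gives the posterior", using a grid approximation for a coin-flip style problem. The data (7 successes in 10 trials), the flat prior, the binomial likelihood, and the helper name `grid_posterior` are assumptions made up for this example.

```python
def grid_posterior(successes, trials, prior, grid_size=101):
    """Approximate the posterior over a success probability on an evenly spaced grid."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: prior(theta) * binomial likelihood kernel.
    unnormalized = [
        prior(t) * t ** successes * (1 - t) ** (trials - successes)
        for t in thetas
    ]
    total = sum(unnormalized)
    # Normalize so the grid probabilities sum to 1.
    return thetas, [u / total for u in unnormalized]


# Hypothetical data: 7 successes in 10 trials, with a flat prior.
thetas, posterior = grid_posterior(7, 10, prior=lambda t: 1.0)
mode = thetas[posterior.index(max(posterior))]
print("posterior mode:", mode)
```

Swapping in a different `prior` function shows how prior knowledge pulls the posterior away from what the data alone would suggest.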

Statistics

Probability distributions that aren’t Normal

Many people are aware of the normal distribution, or "bell curve". What are some other probability distributions, and when are they useful? You can think of a probability distribution as a collection of the number of times something happened. For example, how many students get which grade (70%, 73%, 94%, etc.). We can visualize this…

Computation, Data analysis, Misc, Statistics

Random Chance in Settlers of Catan

Games of chance are often people’s first exposure to statistics. Settlers of Catan is a game that revolves around the probability distribution of the sum of two independent six-sided dice rolls. The game consists of hexagons with one of four possible resources available. These hexagons are normally in a random configuration. Each hexagon receives a random number token.…
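The distribution of the two-dice sum can be worked out by enumerating all 36 equally likely outcomes. The sketch below does that with exact fractions; the function name `two_dice_distribution` is just an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides=6):
    """Exact probability of each sum of two fair, independent dice."""
    # Count how many of the sides * sides outcomes produce each sum.
    counts = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    total = sides * sides
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist = two_dice_distribution()
for roll_sum, prob in dist.items():
    print(roll_sum, prob)
```

A sum of 7 is the most likely outcome (6 of the 36 combinations), while 2 and 12 are the least likely (1 combination each).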

Data analysis, Statistics

Analysis of the predictors of Pokemon strength

In my presentation, the best predictors of Pokemon strength (as measured by the sum of Pokemon statistics) were analyzed using clustering methods.