
Estelle Scifo

(Graph) Data Scientist & Python developer

Bayesian statistics: a short review of existing resources

December 29, 2018 (last updated on 2019/01/22)

Bayesian analysis is something I used during my PhD. During some recent research (a holiday hobby), I came across several very well-written resources on the web that I want to share here.

Key concepts

If you want to understand what Bayesian analysis is and why we need it, I particularly recommend this video:

In-depth understanding

To go beyond those key concepts and understand how to use Bayesian analysis in real life, the following resources are a good start. All of them show practical examples using PyMC and/or PyMC3.

  • Bayesian Methods for Hackers: an “online book” by Cameron Davidson-Pilon. Written as Jupyter notebooks, this tutorial is well documented and easy to follow. The recommended way to read it is to clone the repo and run the notebooks locally, so that you can make changes and see their impact. Examples are available for both PyMC and PyMC3. Note that a printed version exists; consider buying it if you like this work and can afford it.

  • Bayesian Analysis with Python: a book by Osvaldo Martin. A free version of the first edition is regularly made available by the publisher (“Free learning of the day” offer). It has the advantage of being more comprehensive than the previous one.

  • Statistical Rethinking with Python and PyMC3: again, these are Jupyter notebooks, based on the examples from the book “Statistical Rethinking”. There you can also find links to lecture material and videos.
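
To give a taste of the kind of reasoning these resources teach, here is a minimal sketch of a Bayesian update for the bias of a coin, using a simple grid approximation in pure Python. It is not taken from any of the books above and does not use PyMC; the function name `grid_posterior`, the flat prior, and the observed counts (6 heads, 3 tails) are assumptions chosen only for illustration.

```python
def grid_posterior(heads, tails, grid_size=101):
    """Approximate the posterior of a coin's heads-probability on a grid.

    Assumes a flat prior and a binomial likelihood (illustrative choices).
    """
    # Candidate values for the heads-probability, evenly spaced in [0, 1]
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: every candidate value is equally plausible beforehand
    prior = [1.0] * grid_size
    # Likelihood of the observed data for each candidate value
    # (the binomial coefficient is constant and cancels in normalization)
    likelihood = [p ** heads * (1 - p) ** tails for p in grid]
    # Bayes' rule: posterior is proportional to likelihood times prior
    unnormalized = [lik * pr for lik, pr in zip(likelihood, prior)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


# Hypothetical data: 6 heads and 3 tails
grid, posterior = grid_posterior(heads=6, tails=3)

# Summaries of the posterior distribution
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
map_estimate = grid[posterior.index(max(posterior))]

print(f"posterior mean: {posterior_mean:.3f}")
print(f"most probable value on the grid: {map_estimate:.2f}")
```

A grid only works for models with one or two parameters; for anything larger, the sampling methods implemented in libraries such as PyMC3 are what the resources above rely on.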

Tools

Beyond PyMC3: the previous resources already introduced some tools for applying Bayesian analysis to real-life examples. Here are some more:

  • STAN and PySTAN
  • A comparison between PyMC, STAN and a third package (Edward) can be found here
  • ArviZ: a powerful library for creating plots related to Bayesian analysis (compatible with both PyMC3 and PySTAN).

Future

PyMC4 is already in the pipeline! The main change is related to the deprecation of Theano, which PyMC3 uses as its backend.

Of course, there are many more resources on the topic; I will update the list when I come across an interesting new approach.

UPDATE 2019-01-22 added reference to “Statistical Rethinking”, STAN and ArviZ