Stan is a software for Bayesian computation

just like winbugs and jags. Take a look at winbugs or jags videos in my

channel if you are interested about those software.

Instead of using gibbs sampling, stan uses a method called Hamiltonian monte carlo or

HMC. For complex models, HMC may be more effective

than gibbs sampler. Stan uses C++, so it is fast. But you don’t

have to learn c++ to use it. The language stan uses is similar to winbugs and also there

is a R package to run the Bayesian computation from R.

Let us see how to install it first. In google we search “stan bayesian” to get the home

page of stan. Go to rstan link to get info about how to install it. I am using windows

7. I installed R, and the Rtools. For R go to r-project, get the latest version of R.

I am using R 3.1.2. Also go to Rtools page to download and install Rtools. I am using

Rtools 32. Within R, to install rstan, copy the commands

from the website and run in R. It takes a while to complete the installation. Restart

R to start using stan. Let us explain how rstan works using an example:

popularly known as “eight schools” example. You can get the description of this example

in the Rstan vignette. I am running R from RStudio.

The data need to be declared first in a block. In this example, we first write data block.

In here we express how the data list should look like, with data type and restrictions.

Inside it, number of schools J is declared as an integer, with the restriction that it

can not be lower than 0. A semicolon at the end of each line means that

the command has ended. This semicolon is required at the end of every

command in stan. To make a comment, we use double slash.

Now, y is a vector of real numbers with length J and sigma is a vector of real numbers with

length J with the restriction that lower value of standard errors sigma can be 0.

The parameters are also declared similarly in a separate block. Here we put the parameters

of interest. Mean mu, population standard deviation tau

are scalar with tau having a non-negative restriction. And eta is a vector with J elements.

We can also define transformed identical parameter or new parameter theta for school effects

in another block. This block is optional but sometimes necessary for complex model building. The model block describes the model; that

includes priors and likelihoods. The model specification starts in order with top level

of prior, then secondary levels of prior and so on and likelihood at the end.

Here eta is generated from standard normal and y is generated from normal with mean theta

and standard deviation sigma. Note that, we can generate data in a vectorised

form. We do not need a for loop to do so. Therefore, we can generate data faster.

All the Stan data, parameter and model blocks need to be inserted in R as a string in a

block. To do so, we can save them in another file

with extension .stan. we first set the directory where the stan

model file is saved and then run the stanc command to read the model statements.

We then use stan_model command to translate them into C++ commands. Depending on complexity,

this may take some time to complete the compilation. Next we need to prepare data. The data has

to be in a list format in R. In this list, we have J y and sigma; all the variables we

specified in the data block previously. Once this is complete we can generate monte

carlo posterior sample using sampling command. Alternatively, if we do not want to save the

model blocks separately in an external file, we can save them in R as strings. Then stanc

command is not required. We directly move on to stan_model command with model_code argument.

Rest is the same. We can also directly sample from posterior

distribution by using stan command. This stan function is a wrapper for translating, compiling

and sampling altogether. Additional arguments such as warmup and thin

can be used to do burn-in and thining. To examine the model fit, we can print the

parameters and specified quantiles. After fitting, we can summarize the fit and

plot them. It is also possible to use traceplot function.

It is also possible to extract samples from the fit.

If you prefer to use coda package for summaries, it is possible to convert Rstan object to

coda object as well. Then you can summarize the coda objects.

The links of the codes used in this video are provided in the description of the video.

Feel free to provide your feedback about this video. Thanks.

#bayesian #stan #beginners #mcmc #hmc #r #rstan #jags #bugs #winbugs #openbugs #statistics

Prof,

This is a beautiful new r package for Bayesian analysis.

Please upload more videos to teach us Bayesian modeling. No words to appreciate your wonderful work.

May God keep you longer on this planet, n that way, fewer people will fear mathematics n statistics.

This is great. Thanks so much for adding this. Let us know on the stan-users mailing list if you put up more tutorials.

Thanks, that's a great intro. I normally use JAGS but now having issues with memory and speed. I have problem how to code categorical predictors with more than two categories in the data, model blocks. Do you have any suggested sources? I can send you part of my JAGS code, thanks

This is Amazing, i am new to bayesian statistics. I feel it is good. Your video gives me hope

Thanks, it is a ver helpful video

Wow, huge help! Thank you Ehsan!

how we will set the directory for stan program in rstudio?