Category Archives: R

Learning about Forecasting (and using R)

While designing an introductory forecasting course for International Business students, I came across a very interesting book that is available on the web: “Forecasting: Principles and Practice” by Rob J Hyndman and George Athanasopoulos. Among its several nice features, I would stress that the book is aimed at (undergraduate and MBA) business students with little formal training in statistics, and that it presents plenty of examples using R. The third edition uses the tsibble and fable packages, while the second edition uses the forecast package. I really like chapters 2, 3, and 4, where the authors go over exploratory analysis of time series.
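To give a flavor of what exploratory analysis of a time series looks like in R, here is a minimal sketch. It is not taken from the book: it assumes base R only and uses the built-in AirPassengers data set rather than the tsibble, fable, or forecast packages, so that it runs without installing anything.

```r
# Exploratory look at a monthly time series using base R only.
# AirPassengers ships with R (monthly airline passenger totals, 1949-1960).
y <- AirPassengers

# Time plot: shows the trend and the growing seasonal swings.
plot(y, ylab = "Passengers (thousands)", main = "Monthly airline passengers")

# Seasonal subseries plot: one small panel per calendar month.
monthplot(y, ylab = "Passengers (thousands)")

# Autocorrelation function: slow decay suggests a trend,
# peaks at lags of 12 months suggest seasonality.
acf(y, lag.max = 36)

# Classical decomposition into trend, seasonal, and remainder components.
# "multiplicative" is chosen because the seasonal swings grow with the level.
parts <- decompose(y, type = "multiplicative")
plot(parts)
```

The same steps can be reproduced with the packages used in the book; the base R functions are used here only to keep the example self-contained.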

Free book about basic statistics with examples in R

An interesting book for those who are starting to learn basic statistics for data analysis is Practical Statistics for Data Scientists by Peter Bruce and Andrew Bruce. It can be downloaded for free here. The book covers the basics of exploratory data analysis and frequentist statistical theory. Additionally, its final chapters cover classification, statistical machine learning, and unsupervised learning. I do like these last chapters, as they are carefully written and easy to understand. The examples using R are very useful. In sum, this is a book worth reading for beginners. The last chapters can also be very helpful for readers with more experience in data analysis.

A comprehensive road map to learn R

Learning new software, or even a new programming language, is always an interesting journey. Most of the time, the tutorials and books we find are not exactly what we need. A useful resource that I found is the Big Book of R. It is one of the most comprehensive repositories of tutorials and general information about R. You can find suggestions according to your needs or even according to your background, for instance Journalism, Social Sciences, or Life Sciences. It has very good sections on Machine Learning and on R programming.

An Introduction to Statistical Learning with Applications in Python

I came across this very interesting GitHub repository by Qiuping X., in which she posted the Python code she prepared for the book “An Introduction to Statistical Learning with Applications in R” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. This is very useful for those who are learning Python, and it certainly facilitates the migration from R to Python too.

Statistical Learning using R

I recently came across this book titled “An Introduction to Statistical Learning, with Applications in R”.

It can be downloaded for free from the authors' webpage, which also contains the R code, data sets, errata, slides and videos for the Statistical Learning MOOC, and other valuable information.

That said, I think this is a very useful book for those interested in Statistical Learning. It is very accessible to most people, since it does not require a strong mathematical background.

For those interested in gaining a deeper understanding of these topics, I strongly suggest the book “The Elements of Statistical Learning”, which is also available for download at no cost.

Some resources for using R with spatial data

Spatial data models and visualizations are important tools in the design of public policies. I provide below some links to tutorials about spatial modelling using R that I have found useful.

The first stop should be the Spatial Data Science with R website. It contains a wealth of information organized in a straightforward way. My next suggestion is the website An Introduction to Spatial Econometrics in R. A different version of that material is also available at R-Bloggers: An Introduction to Spatial Econometrics in R. There is also a tutorial by Luc Anselin: An Introduction to Spatial Regression Analysis in R.

Finally, there is a video made by econometricsacademy about Spatial Econometrics in R.

Reading DATASUS and ANS .dbc files using R

Those interested in conducting research using Brazilian data often come across some unusual file formats used by Brazilian government agencies. One example is the .dbc files used for health data produced by DATASUS and by ANS. These .dbc files are not related to the .dbc files used by FoxPro, for instance. In fact, they are just compressed .dbf files.

There is an R package that can convert these .dbc files into regular .dbf files. This package is named read.dbc. Once you have generated the .dbf files, you can use the package foreign and its function read.dbf to import the data into R. This very same package allows you to save the data in Stata .dta format by means of the function write.dta.
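The workflow above can be sketched as follows. This is only an illustration under a few assumptions: the helper names import_dbf and import_dbc and the file names in the usage note are hypothetical, the read.dbc package must be installed separately, and its dbc2dbf function is assumed to take the input and output paths as its first two arguments.

```r
# Sketch: decompress a DATASUS/ANS .dbc file, import it, and optionally
# export it to Stata format. Helper names are hypothetical.
library(foreign)  # provides read.dbf() and write.dta()

# Read a regular .dbf file; if dta_path is given, also save a Stata copy.
import_dbf <- function(dbf_path, dta_path = NULL) {
  data <- read.dbf(dbf_path, as.is = TRUE)  # as.is = TRUE keeps text as character
  if (!is.null(dta_path)) {
    write.dta(data, dta_path)
  }
  data
}

# Decompress a .dbc file into a temporary .dbf file, then import it.
import_dbc <- function(dbc_path, dta_path = NULL) {
  if (!requireNamespace("read.dbc", quietly = TRUE)) {
    stop("The read.dbc package is required to decompress .dbc files.")
  }
  dbf_path <- tempfile(fileext = ".dbf")
  on.exit(unlink(dbf_path))                 # remove the temporary file afterwards
  read.dbc::dbc2dbf(dbc_path, dbf_path)     # .dbc -> .dbf
  import_dbf(dbf_path, dta_path)
}
```

A call such as import_dbc("my_file.dbc", "my_file.dta"), with hypothetical file names, would return a data frame and write the Stata file as a side effect. If I recall correctly, the read.dbc package also offers a function of the same name that reads a .dbc file straight into a data frame, which makes the intermediate .dbf file unnecessary when you only need the data in R.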

The package maptools also has a function to read dbf files named dbf.read, though I have not tried it yet.

 

Statistical Learning using R – A downloadable book

I was looking for some books on Statistical Learning for a new course that I am preparing, and I came across a very interesting book titled “The Elements of Statistical Learning: Data Mining, Inference, and Prediction” by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This book can be downloaded for free. I enjoyed reading it, since it is very well written and contains several examples. Additionally, the data and R scripts used in the book are available for download on the book's website.

Graphs in R

Carefully thought-out graphs and diagrams can help you make your point. R has very nice features when it comes to graphs. Nevertheless, things can get cumbersome quickly.
This first link provides a good explanation of how to make simple graphs. This other link contains an encyclopedic treatment of graphs in R.
Finally, in the event you are using RStudio and the plot refuses to show up, go to the Console and type “dev.off()” repeatedly until you get an error message, which means that no graphics device is left open. Then, try plotting the graph again. Another solution is to restart RStudio. More on this issue can be found here.
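Typing dev.off() over and over can be automated. The sketch below is one way to do it, assuming the problem is caused by leftover open graphics devices; the function name close_all_devices is hypothetical.

```r
# Close every open graphics device and report how many were closed.
# dev.list() returns NULL when no device is open, so the loop stops
# at the point where a manual dev.off() would raise an error.
close_all_devices <- function() {
  closed <- 0L
  while (!is.null(dev.list())) {
    dev.off()
    closed <- closed + 1L
  }
  closed
}
```

After calling close_all_devices(), try the plot again. Base R also provides graphics.off(), which shuts down all open devices in a single call; the loop above simply makes explicit what is happening.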