Archive for category R

Use sparse data set to create contour map

Problem: We have a sparse array that contains some real measurements, and we want to create a contour map from it. The data set includes the x and y coordinates and the measured values.
Solution: I found the following article, which provides a complete solution to this problem. Follow the link below to read the article.

Software for Exploratory Data Analysis and Statistical Modelling

Then I wrote an R script to create a contour map of my measured data.

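library(geoR)  # kept from the original script; loess() itself is in the stats package
# x = column index, y = row index, z = measured value
temp.df = data.frame(y = mydata$row,
x = mydata$col,
z = mydata$temp)
# Fit a local quadratic surface; span = 1 uses all points in every local fit
temp.loess = loess(z ~ x*y, data = temp.df, degree = 2, span = 1)
# Regular grid to predict on (x from 1 to 16, y from 1 to 4, step 0.1)
temp.fit = expand.grid(list(x = seq(1, 16, 0.1), y = seq(1, 4, 0.1)))
# For an expand.grid() data frame, predict() returns a matrix (x by y)
z = predict(temp.loess, newdata = temp.fit)
temp.fit$Height = as.numeric(z)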

# basic image
image(seq(1, 16, 0.1), seq(1, 4, 0.1), z,
   xlab = "X Coordinate", ylab = "Y Coordinate",
   main = "Surface temp data")
box()

# lattice plot
library(lattice)
levelplot(Height ~ x*y, data = temp.fit,
   xlab = "col", ylab = "row",
   main = "Surface map",
   col.regions = terrain.colors(100)
)
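
For readers without the original mydata, the same workflow can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the sparse measurements (obs), the formula used to fake the temperatures, the coarser grid step, and the output file are all made up. It draws line contours with base R's contour() rather than image() or levelplot().

```r
# Hypothetical sparse data: keep every third cell of a 16 x 4 layout.
full <- expand.grid(x = 1:16, y = 1:4)
obs <- full[seq(1, nrow(full), by = 3), ]
# Made-up "temperature" values so the example is self-contained.
obs$z <- 20 + 0.5 * obs$x - obs$y + sin(obs$x / 3)

# Same kind of loess surface as in the script above.
fit <- loess(z ~ x * y, data = obs, degree = 2, span = 1)

# Predict on a regular grid that stays inside the observed range.
gx <- seq(1, 16, by = 0.5)
gy <- seq(1, 4, by = 0.25)
grid <- expand.grid(x = gx, y = gy)
zhat <- predict(fit, newdata = grid)

# expand.grid() varies x fastest, so the predictions fill a
# length(gx) by length(gy) matrix column by column.
zmat <- matrix(as.numeric(zhat), nrow = length(gx), ncol = length(gy))

# Write the contour map to a temporary PDF file.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
contour(gx, gy, zmat, xlab = "col", ylab = "row",
        main = "Contour map (synthetic data)")
dev.off()
```

In an interactive session the pdf() and dev.off() lines can be dropped to draw on the screen device instead.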

Slicing a data frame in R

Slice data rows in a data frame

Normally, we can easily slice a data frame based on given values in a column. Because a data frame can be treated as a list, we can use the expression “tbData[[colName]]” to refer to a column, where colName holds the column header as a string. That makes the program more generic.


tbData[tbData[[colName]]==colValue, ]
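
A small worked example may make the pattern concrete. The data frame contents, the column name, and the value below are invented for illustration only.

```r
# Hypothetical data frame; the column to filter on is chosen at run time.
tbData <- data.frame(plot = c("A", "B", "A", "C"),
                     yield = c(10, 12, 9, 15),
                     stringsAsFactors = FALSE)
colName <- "plot"
colValue <- "A"

# Keep the rows where the chosen column equals the chosen value.
sliced <- tbData[tbData[[colName]] == colValue, ]
print(sliced)
```

If the column can contain NA, wrapping the condition in which() keeps the NA comparisons from producing all-NA rows in the result.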

The %in% operator in R is very useful. It tests, element by element, whether each value of the left-hand vector occurs in the right-hand vector, which is especially handy for character data. It can be used to:
1) Test whether the elements of a shorter vector are in a longer vector (e.g., 6:10 %in% 1:36);
2) Test which elements of a longer vector are in a shorter vector (e.g., 1:36 %in% 6:10);
3) Test character vectors or factors (e.g., c("d", "e") %in% c("a", "b", "c", "d")).
If you want the indexes of the matching elements inside the larger vector, use the function “which”. For example, which(1:36 %in% 6:10) returns the positions of the elements of 1:36 that are also in 6:10.
With this operator, we can slice a data frame based on a given string or string vector.


myDf[myDf$contrast %in% compstr, ]
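
The calls above can be tried out as follows. The contents of myDf and compstr are made up for the example; only the column name contrast comes from the line above.

```r
# Membership tests from the list above.
print(6:10 %in% 1:36)                          # are the short vector's values in the long one?
print(c("d", "e") %in% c("a", "b", "c", "d"))  # character vectors work the same way
idx <- which(1:36 %in% 6:10)                   # positions of the matches
print(idx)

# Hypothetical data frame with a character column named contrast.
myDf <- data.frame(contrast = c("A-B", "A-C", "B-C", "A-B"),
                   estimate = c(0.5, 1.2, -0.3, 0.7),
                   stringsAsFactors = FALSE)
compstr <- c("A-B", "B-C")

# Keep only the rows whose contrast is one of the strings in compstr.
subDf <- myDf[myDf$contrast %in% compstr, ]
print(subDf)
```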

Slice data columns in a data frame

In R, slicing columns out of a data frame is pretty simple: use either the column indexes or the column names.
Using the index (position) of the columns is more generic.


df[c(1, 4:ncol(df))]

Using column names to slice the data frame is useful when you know the column headers beforehand.


df[c('c1', 'c5', 'c10')]
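
Both forms can be checked on a throwaway data frame. The ten columns c1 to c10 below are invented so that the two expressions above run unchanged.

```r
# Hypothetical data frame with columns c1 ... c10.
df <- as.data.frame(matrix(1:30, nrow = 3))
names(df) <- paste0("c", 1:10)

# By position: the first column plus columns 4 through the last one.
by_position <- df[c(1, 4:ncol(df))]
print(names(by_position))

# By name: only the listed columns, in the order given.
by_name <- df[c("c1", "c5", "c10")]
print(names(by_name))
```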

Construct R formula with variable names in a vector automatically

Suppose you have a list of variable names in a vector and you want to construct a standard linear model from it. How can we do that in R? It is actually pretty easy: with two R functions, formula and paste, the model formula can be constructed automatically.

1. Additive model

Define the predictor names, either by hand:

PredictorVariables <- c("x1", "x2")

or generated with paste:

PredictorVariables <- paste("x", 1:100, sep = "")

We can then construct a formula as follows:

Formula <- formula(paste("y ~ ", 
     paste(PredictorVariables, collapse=" + ")))
lm(Formula, Data)

2. Multiplicative model

# keep this vector short: " * " generates every interaction between the variables
PredictorVariables <- paste("x", 1:3, sep = "")

We can then construct a formula as follows:

Formula <- formula(paste("y ~ ", 
     paste(PredictorVariables, collapse=" * ")))
lm(Formula, Data)

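If you have more than two variable names in the vector, the multiplicative model can become quite complex because it includes higher-order interactions: k variables joined with " * " expand to 2^k - 1 terms, so keep the vector short for this form.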
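
To see the difference between the two forms, here is a minimal sketch on simulated data. The data frame Data, its three predictors, and the coefficients used to simulate y are assumptions made for the example. The last line uses reformulate() from the stats package, an alternative to the paste approach that the text above does not mention.

```r
# Simulated data: three hypothetical predictors and a response y.
set.seed(42)
Data <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
Data$y <- 1 + 2 * Data$x1 - Data$x2 + rnorm(50, sd = 0.1)

PredictorVariables <- paste("x", 1:3, sep = "")

# Build both formulas from the same vector of names.
AddFormula <- formula(paste("y ~ ",
     paste(PredictorVariables, collapse = " + ")))
MultFormula <- formula(paste("y ~ ",
     paste(PredictorVariables, collapse = " * ")))

AddFit <- lm(AddFormula, Data)
MultFit <- lm(MultFormula, Data)

# Count the coefficients (intercept included) in each model.
n_add <- length(coef(AddFit))
n_mult <- length(coef(MultFit))
print(c(additive = n_add, multiplicative = n_mult))

# Alternative: reformulate() builds the additive formula directly.
AltFormula <- reformulate(PredictorVariables, response = "y")
print(AltFormula)
```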

Using a similar idea, you can also include random terms in the constructed model.
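
As a hedged illustration of that idea, the sketch below appends a random intercept written in lme4-style syntax. The post does not say which mixed-model package it has in mind, so the "(1 | group)" notation and the column name group are assumptions. Only the formula is built here; no model is fitted.

```r
PredictorVariables <- c("x1", "x2")
RandomTerm <- "(1 | group)"   # hypothetical grouping column named group

# Paste the fixed and random terms together exactly as before.
MixedFormula <- formula(paste("y ~ ",
     paste(c(PredictorVariables, RandomTerm), collapse = " + ")))
print(MixedFormula)

# With lme4 installed, the formula could be passed on, for example:
# lme4::lmer(MixedFormula, data = Data)
```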

Appendix: Get all column headers that match a certain pattern.
Suppose I have a data frame called my.df and some of its column headers include the string “trt_”. We can use the following method to get all column headers containing “trt_” and store them in a vector.


my.colnames <- colnames(my.df)
my.headers <- my.colnames[grepl("trt_", my.colnames)]
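
Here is the same lookup on a made-up data frame. The column names are invented, and the grep() variant with value = TRUE is an equivalent spelling rather than something taken from the post.

```r
# Hypothetical data frame with two treatment columns.
my.df <- data.frame(id = 1:3,
                    trt_a = c(1, 0, 1),
                    trt_b = c(0, 1, 1),
                    block = c("x", "y", "z"))

my.colnames <- colnames(my.df)
my.headers <- my.colnames[grepl("trt_", my.colnames)]
print(my.headers)

# Equivalent one-liner; the leading ^ only matches names that start with trt_.
my.headers2 <- grep("^trt_", my.colnames, value = TRUE)
```

The resulting vector can be fed straight into the paste and formula construction described earlier.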

Generate PowerPoint slides in R

A new R package, “officer”, makes generating PowerPoint slides in R much easier. It is newly developed and still has bugs, but there are ways to work around them, and I believe the developer of the package will fix them quickly.

First, you have to install the package before using it.

install.packages("officer")

Second, use the examples on the GitHub site and other sites as the starting point for creating your own PowerPoint slides. I found this site (http://lenkiefer.com/2017/09/23/crafting-a-powerpoint-presentation-with-r/) really useful.

Third, work around the bugs. When I used the package to automate the generation of PowerPoint slides, I found that slides with inserted images did not work at all. The generated PowerPoint files could not be opened correctly; PowerPoint always popped up an error message offering to “repair” the presentation. I searched for several hours without finding any useful information. Then I visited the “officer” package’s GitHub page and found a hint among the bug reports: the error is caused by a bug in the package. Image filenames must not contain spaces; otherwise, you get the error.

So I went ahead and changed my program to make sure there are no spaces in any image filenames. Voilà, it works great.
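
One way to enforce that rule is to copy each image to a space-free name before inserting it. The helper below is a hypothetical sketch in base R, not part of officer, and the placeholder file only stands in for a real image. It changes the file name only, not the directory part of the path.

```r
# Hypothetical helper: copy an image to a file whose name has no spaces.
copy_without_spaces <- function(path) {
  safe_name <- gsub("[[:space:]]+", "_", basename(path))
  safe_path <- file.path(dirname(path), safe_name)
  if (safe_path != path) {
    file.copy(path, safe_path, overwrite = TRUE)
  }
  safe_path
}

# Demonstration with a placeholder file standing in for a real image.
demo_dir <- tempfile("slides")
dir.create(demo_dir)
img <- file.path(demo_dir, "my plot 1.png")
writeLines("placeholder", img)

safe_img <- copy_without_spaces(img)
print(basename(safe_img))
```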

Another useful point is the slide size. If you want an inserted image to occupy the whole 16:9 widescreen slide, you should use the following parameters (in inches):

left = 0.0
top = 0.0
width = 40.0/3.0
height = 7.5

Using this set of parameters in the ph_with_img_at function guarantees that the inserted image fills the whole slide.
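
ph_with_img_at belongs to the officer release this post was written against. To my knowledge, later releases express the same placement with ph_with(), external_img() and ph_location(); treat that mapping as an assumption and check it against your installed version. The helper name, the template file, and the layout and master names below are hypothetical, and the helper is only defined here, not run.

```r
# Sanity check on the numbers above: 40/3 by 7.5 inches is a 16:9 slide.
stopifnot(isTRUE(all.equal((40 / 3) / 7.5, 16 / 9)))

# Hypothetical helper; assumes an officer release that provides
# slide_size(), add_slide(), ph_with(), external_img() and ph_location().
add_full_slide_image <- function(pres, img_file) {
  # Guard against the space-in-filename problem described earlier.
  stopifnot(!grepl(" ", basename(img_file)))
  size <- officer::slide_size(pres)   # slide width and height in inches
  w <- size[["width"]]
  h <- size[["height"]]
  # Layout and master names are assumptions; they depend on the template.
  pres <- officer::add_slide(pres, layout = "Title and Content",
                             master = "Office Theme")
  officer::ph_with(
    pres,
    value = officer::external_img(img_file, width = w, height = h),
    location = officer::ph_location(left = 0, top = 0, width = w, height = h)
  )
}

# Usage (not run here), with a hypothetical 16:9 template file:
# pres <- officer::read_pptx("widescreen_template.pptx")
# pres <- add_full_slide_image(pres, "figure_1.png")
# print(pres, target = "report.pptx")
```

Reading the size from the presentation avoids hard-coding 40/3 and 7.5, so the same helper should also work with a 4:3 template.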

I enjoy using the package. Thanks to the developer for sharing this wonderful work of art with us.


Using the RSQLServer package in R

In the R programming environment, you can conveniently connect to SQL Server through the RSQLServer package. However, it is a little bit tricky and not like the way you connect to a MySQL server through the RMySQL package. I searched for RSQLServer on Google and tried to find a good example of connecting to SQL Server. At the beginning, nothing worked. Then I read a post and used its method, and I was able to connect to the SQL Server. However, the connection opened a system database instead of the database I wanted. Eventually I used the R help command to learn how to use the dbConnect function. That helped me figure out the correct way to connect to the right database on the SQL Server.

1) Install the RSQLServer package. Run this command in R:


install.packages("RSQLServer")

2) To use the package, you only need to load DBI with the following R command:


library(DBI)

3) To connect to a specific database on a given SQL Server:


con <- dbConnect( RSQLServer::SQLServer(), server="localhost", database = "yourdatabasename", properties=list(user="yourusername", password="yourpassword") )

That should work well for you. You have to provide your SQL Server address, the database name, and your login information to the function.

Once you are connected, you can use all the functions available in the DBI package to query tables and manipulate data in the database. There are no more tricks.
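
As a rough sketch of what that looks like, the snippet below builds a query string and shows, in comments, how it could be sent through DBI. The helper name and the table name mytable are hypothetical, and the DBI calls are left commented out because they need the live connection con from step 3.

```r
# Hypothetical helper: build a query that previews the first n rows.
# SQL Server restricts the row count with TOP rather than LIMIT.
build_top_query <- function(table, n = 10) {
  paste0("SELECT TOP ", as.integer(n), " * FROM ", table)
}

sql <- build_top_query("mytable", 5)
print(sql)

# With the connection from step 3 (not run here):
# DBI::dbListTables(con)               # list the tables in the database
# rows <- DBI::dbGetQuery(con, sql)    # run the query, get a data frame
# DBI::dbDisconnect(con)               # close the connection when done
```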
