Archive for category Programming

PyCharm – A cross-platform Python IDE

Here is how the PyCharm site describes it:

PyCharm is a dedicated Python and Django IDE providing a wide range of essential tools for Python developers, tightly integrated together to create a convenient environment for productive Python development and Web development.

It is available for Windows, macOS, and Linux. It can be downloaded from http://www.jetbrains.com/pycharm/download/.

I have just started using it, after reading some good comments about it online.

Generate PowerPoint slides in R

A new R package, “officer”, makes generating PowerPoint slides in R much easier. It is newly developed and still has bugs, but there are ways to work around them, and I expect the developer to fix them quickly.

First, you have to install the package before using it.

install.packages("officer")

Second, use the examples on the package’s GitHub site and on other sites as a starting point for your own PowerPoint slides. I found this post (http://lenkiefer.com/2017/09/23/crafting-a-powerpoint-presentation-with-r/) really useful.

Third, work around the bugs. When I used the package to automate slide generation, every slide with an inserted image was broken: the generated PowerPoint files could not be opened correctly, and PowerPoint always popped up an error message offering to “repair” the presentation. I searched for several hours without finding anything useful. Then I visited the “officer” package’s GitHub page and found a hit in the issue tracker. The error message is caused by a bug in the package: image filenames must not contain spaces, otherwise you get the error.

So I changed my program to make sure there are no spaces in any image filename. Voilà, it works great.

Another useful point is the slide size. If you want an inserted image to occupy the whole 16:9 widescreen slide, use the following parameters (in inches; 40/3 ≈ 13.33 wide by 7.5 high gives exactly 16:9):

left = 0.0
top = 0.0
width = 40.0/3.0
height = 7.5

Pass this set of parameters to the ph_with_img_at function to make the inserted image fill the whole slide.

I am enjoying the package. Thanks to the developer for sharing this wonderful piece of work with us.

Using the RSQLServer package in R

In the R programming environment, you can conveniently connect to SQL Server through the RSQLServer package. However, it is a little bit tricky, and not like the way you connect to a MySQL server through the RMySQL package. I searched for RSQLServer on Google to find a good example of connecting to SQL Server. At first, nothing worked. Then I tried the method from one post and was able to connect, but the connection opened a system database instead of the database I wanted. Eventually I used R’s help command to learn how the dbConnect function works, which helped me figure out the correct way to connect to the right database on the server.

1) Install the package: RSQLServer. Run command in R:


install.packages("RSQLServer")

2) To use the package, you only need to load DBI with the following R command:


library(DBI)

3) To connect to a specific database on a given SQL Server:


con <- dbConnect( RSQLServer::SQLServer(), server="localhost", database = "yourdatabasename", properties=list(user="yourusername", password="yourpassword") )

That should work well for you. You have to provide your SQL Server address, the database name, and your login information to the function.

Once you are connected, you can use all the functions available in the DBI package to query tables and manipulate data in the database. There are no more tricks.

Converting between a list of dictionaries and a dataframe in Python

In Python there is a dataframe package: pandas. It simplifies a lot of things, and its DataFrame is comparable to a dataframe in R. Sometimes you need to convert a list of dictionaries to a dataframe and vice versa.

To use dataframes you need to import pandas.

To convert a list of dictionaries to a dataframe, use the following:

pandas.DataFrame(a_list_dict)
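
As a minimal sketch of the call above: each dictionary becomes a row and the keys become columns. The sample records are made up for illustration, and the second one deliberately lacks a key to show that pandas fills the gap with NaN.

```python
import pandas as pd

# Hypothetical sample records; the second one has no "score" key.
a_list_dict = [
    {"name": "alice", "score": 90},
    {"name": "bob"},
]

df = pd.DataFrame(a_list_dict)

print(sorted(df.columns))           # column names come from the dictionary keys
print(df.shape)                     # one row per dictionary
print(df["score"].isna().tolist())  # marks the record whose key was missing
```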

To convert a dataframe to a list of dictionaries, use the following:

list(df.T.to_dict().values())
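
The sketch below runs the expression above on a small made-up dataframe and compares it with pandas' own `to_dict(orient="records")`, which returns one dictionary per row directly. The sample data is hypothetical. The transpose-based form relies on the index being unique, since the index values become dictionary keys along the way.

```python
import pandas as pd

# Hypothetical sample data for illustration.
records = [
    {"id": 1, "city": "Austin"},
    {"id": 2, "city": "Boston"},
]
df = pd.DataFrame(records)

# Transpose-based form from the post: rows become columns, then dict values.
via_transpose = list(df.T.to_dict().values())

# Alternative: ask pandas for one dictionary per row.
via_records = df.to_dict(orient="records")

print(via_transpose)
print(via_records)
```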

Clustering in R

There is a nice post at https://stackoverflow.com/questions/15376075/cluster-analysis-in-r-determine-the-optimal-number-of-clusters that provides a number of examples illustrating how to determine the number of clusters.

Below are two examples:

library(reshape2)  # provides acast
library(vegan)     # provides cascadeKM

# prepare data from a dataframe
# var1 holds the first grouping variable (matrix rows)
# var2 holds the second grouping variable (matrix columns)
# var3 holds the values used in clustering
# reshape the dataframe into a matrix
kmdata <- acast(my.data, var1 ~ var2, value.var = 'var3')

# example 1: cascade of k-means partitions, scored by the Calinski criterion
my.cluster <- cascadeKM(kmdata, inf.gr = 1, sup.gr = nrow(kmdata) - 1)
plot(my.cluster, sortg = TRUE, grpmts.plot = TRUE)
calinski.best <- as.numeric(which.max(my.cluster$results[2, ]))
cat("Calinski criterion optimal number of clusters:", calinski.best, "\n")

# example 2: within-groups sum of squares for each number of clusters
wss <- (nrow(kmdata) - 1) * sum(apply(kmdata, 2, var))
for (i in 2:(nrow(kmdata) - 1)) {
  wss[i] <- sum(kmeans(kmdata, centers = i)$withinss)
}
plot(1:(nrow(kmdata) - 1), wss, type = "b", xlab = "Number of Clusters",
     ylab = "Within groups sum of squares")

# fit k-means with the chosen number of clusters (4 here)
my.cluster2 <- kmeans(kmdata, 4)

Here is another nice article about clustering: http://www.sthda.com/english/wiki/cluster-analysis-in-r-unsupervised-machine-learning
