Posts Tagged dataframe

Converting between a list of dictionaries and a dataframe in Python

In Python, the pandas package provides a dataframe. It simplifies a lot of things and is comparable to a dataframe in R. Sometimes you need to convert a list of dictionaries to a dataframe and vice versa.

To use a dataframe, you need to import pandas.

To convert a list of dictionaries to a dataframe, use the following:
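A minimal sketch of this conversion, assuming a hypothetical list named `records` whose dictionaries share the same keys. Passing the list to the `pd.DataFrame` constructor is one common way to do it, though not necessarily the only one:

```python
import pandas as pd

# hypothetical sample data: each dictionary represents one row
records = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
]

# each key becomes a column, each dictionary becomes a row
df = pd.DataFrame(records)
print(df)
```

If some dictionaries lack a key, pandas fills the corresponding cells with NaN.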


To convert a dataframe to a list of dictionaries, use the following:
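A minimal sketch of the reverse direction, assuming a small hypothetical dataframe named `df`. Calling `to_dict` with `orient="records"` is one common way to get one dictionary per row:

```python
import pandas as pd

# hypothetical sample dataframe with two rows
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# orient="records" yields a list with one dictionary per row,
# keyed by column name
records = df.to_dict(orient="records")
print(records)
```

The index is not included in the resulting dictionaries, so reset it into a column first if you need to keep it.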




Clustering in R

There is a nice post that provides a number of examples illustrating how to determine the number of clusters.

Below are two examples:

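# load reshape2 (for acast) and vegan (for cascadeKM)
library(reshape2)
library(vegan)
# prepare data from a dataframe
# (mydata is a placeholder name for your own dataframe)
# var1 holds the first grouping variable (becomes the matrix rows)
# var2 holds the second grouping variable (becomes the matrix columns)
# var3 holds the values used in clustering
# cast the dataframe into a var1-by-var2 matrix
kmdata <- acast(mydata, var1 ~ var2, value.var='var3')
# k-means partitions from 1 to nrow(kmdata)-1 groups
my.cluster <- cascadeKM(kmdata, inf.gr = 1, sup.gr = nrow(kmdata)-1)
plot(my.cluster, sortg = TRUE, grpmts.plot = TRUE)
calinski.best <- as.numeric(which.max(my.cluster$results[2,]))
cat("Calinski criterion optimal number of clusters:", calinski.best, "\n")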
# within-groups sum of squared errors for each number of clusters
wss <- (nrow(kmdata)-1)*sum(apply(kmdata,2,var))
for (i in 2:(nrow(kmdata)-1)) wss[i] <- sum(kmeans(kmdata,
    centers=i)$withinss)
plot(1:(nrow(kmdata)-1), wss, type="b", xlab="Number of Clusters",
    ylab="Within groups sum of squares")
# fit k-means with the chosen number of clusters (4 here)
my.cluster2 <- kmeans(kmdata, 4)

Here is another nice article about clustering.

