Posts Tagged transpose

Clustering in R

There is a nice post at https://stackoverflow.com/questions/15376075/cluster-analysis-in-r-determine-the-optimal-number-of-clusters that provide a number of examples to illustrate how to determine number of clusters.

Below are two examples:

# prepare data from a dataframe
# var1 includes one group
# var2 includes second group
# var3 includes all the values used in clustering
# transpose the dataframe data to matrix
kmdata <- acast(my.data, var1 ~ var2, value.var='var3')
# kmeans
my.cluster <- cascadeKM(kmdata, inf.gr = 1, sup.gr = nrow(kmdata)-1)
plot(my.cluster, sortg = TRUE, grpmts.plot = TRUE)
calinski.best <- as.numeric(which.max(my.cluster$results[2,]))
cat("Calinski criterion optimal number of clusters:", calinski.best, "\n")
# sum of square error
wss <- (nrow(kmdata)-1)*sum(apply(kmdata,2,var))
for (i in 2:(nrow(kmdata)-1)) wss[i] <- sum(kmeans(kmdata,
centers=i)$withinss)
plot(1:(nrow(kmdata)-1), wss, type="b", xlab="Number of Clusters",
ylab="Within groups sum of squares")
my.cluster2 <- kmeans(kmdata, 4)

Here is another nice article talking about clustering (http://www.sthda.com/english/wiki/cluster-analysis-in-r-unsupervised-machine-learning)

Share

Tags: , , , ,

MySQL transpose function of aggregation

An article published at the following providing a lot of examples.

http://www.artfulsoftware.com/infotree/queries.php

At the stackoverflow website there is an example as below. It helps me a lot.

The following text from the site.

Question:

Here is what my current mysql table looks like:

PunchID EmpID PunchEvent PunchDateTime
1 0456 clockin 5/14/2013 8:36:26 AM
48 0456 breakout 5/14/2013 12:01:29 PM
53 0456 breakin 5/14/2013 12:28:31 PM
54 0456 clockout 5/14/2013 2:28:33 PM
57 0456 clockin 5/15/2013 7:38:34 AM
58 0456 breakout 5/15/2013 7:38:39 AM
59 0456 breakin 5/15/2013 7:38:41 AM
60 0456 clockout 5/15/2013 7:38:42 AM
Now I want to return a result set that is grouped based on the day of the week like so:

Day ClockIn BreakOut BreakIn ClockOut
Tuesday 8:36:26 AM 12:01:29 PM 12:28:31 PM 2:28:33 PM
Wednesday 7:38:34 AM etc, etc…
This is my current query. But it only returns the first punch for each day.

Answer:

SELECT DATE_FORMAT(PunchDateTime, ‘%W’) DAY,
MAX(CASE WHEN PunchEvent = ‘ClockIn’ THEN DATE_FORMAT(PunchDateTime, ‘%r’) END) ClockIn,
MAX(CASE WHEN PunchEvent = ‘BreakOut’ THEN DATE_FORMAT(PunchDateTime, ‘%r’) END) BreakOut,
MAX(CASE WHEN PunchEvent = ‘BreakIn’ THEN DATE_FORMAT(PunchDateTime, ‘%r’) END) BreakIn,
MAX(CASE WHEN PunchEvent = ‘ClockOut’ THEN DATE_FORMAT(PunchDateTime, ‘%r’) END) ClockOut
FROM tableName
WHERE EmpID = 456
GROUP BY DATE_FORMAT(PunchDateTime, ‘%W’)
ORDER BY PunchDateTime
SQLFiddle Demo
OUTPUT

╔═══════════╦═════════════╦═════════════╦═════════════╦═════════════╗
║ DAY ║ CLOCKIN ║ BREAKOUT ║ BREAKIN ║ CLOCKOUT ║
╠═══════════╬═════════════╬═════════════╬═════════════╬═════════════╣
║ Tuesday ║ 08:36:26 AM ║ 12:01:29 PM ║ 12:28:31 PM ║ 02:28:33 PM ║
║ Wednesday ║ 07:38:34 AM ║ 07:38:39 AM ║ 07:38:41 AM ║ 07:38:42 AM ║
╚═══════════╩═════════════╩═════════════╩═════════════╩═════════════╝

Share

Tags: , , , ,