R - k-means clustering on the basis of a fixed number of variables out of all variables

I am a beginner in R and data analysis. I have a data set of around 2500 rows and 7 columns. I want to cluster the data set into 15 centers on the basis of the first 2 columns, keeping the other columns intact in the clustered data set.

I need to display the clustered data set sorted on the basis of the third column.

Can anyone help me with the required syntax? Let the CSV file name be locdata.csv, with the first 2 columns named "lat" and "lon" and the third column named "date".

This should get you there.

First create the dataset (alternatively, import your CSV file):

    set.seed(1)
    df <- data.frame(matrix(rnorm(n=10000, mean=10, sd=20), ncol=8))
    names(df)[1:3] <- c("lat", "lon", "date")
    # Use df <- read.csv(..) instead to load your file

    library(dplyr)
    cluster.df <- select(df, lat, lon)  # Select the columns to cluster on
    km <- kmeans(cluster.df, 15)
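For the CSV case in the question, the loading step might look roughly like the sketch below. Since locdata.csv is not available here, the code writes a small stand-in file to a temporary path first; the stand-in data, the names `demo`, `csv_path`, `loc` and `km_loc`, and the columns `v4` to `v7` are assumptions made for illustration only. The `nstart` and `iter.max` arguments are optional additions, not part of the answer above.

```r
# Stand-in for locdata.csv: 200 rows, 7 columns.
# Column names other than lat, lon and date are hypothetical.
set.seed(1)
demo <- data.frame(
  lat  = runif(200, -90, 90),
  lon  = runif(200, -180, 180),
  date = as.character(as.Date("2015-01-01") + sample(0:364, 200, replace = TRUE)),
  v4 = rnorm(200), v5 = rnorm(200), v6 = rnorm(200), v7 = rnorm(200)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(demo, csv_path, row.names = FALSE)

# With the real file this would be read.csv("locdata.csv")
loc <- read.csv(csv_path, stringsAsFactors = FALSE)

# Cluster on lat and lon only; the other columns are not passed to kmeans.
# nstart = 10 reruns the algorithm from 10 random starts and keeps the best fit.
km_loc <- kmeans(loc[, c("lat", "lon")], centers = 15, nstart = 10, iter.max = 50)

# Number of rows assigned to each of the 15 clusters
table(km_loc$cluster)
```

Only the two selected columns influence the clustering, so the remaining columns of `loc` stay untouched and can be combined with the cluster labels afterwards.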

Next you can extract the clusters, using the fact that kmeans retains the original row order:

    # Extract the clusters and add them to the original data frame
    df$cluster <- km$cluster

    # Sort on whatever column you prefer
    df %>%
      arrange(date, cluster)
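If you would rather not depend on dplyr, the same steps can be sketched in base R. This is a minimal, self-contained version that rebuilds the simulated data from above; the name `sorted` is only an illustration, and the `stopifnot` line is an added sanity check on the "one label per row, in row order" assumption.

```r
set.seed(1)
df <- data.frame(matrix(rnorm(n = 10000, mean = 10, sd = 20), ncol = 8))
names(df)[1:3] <- c("lat", "lon", "date")

# Cluster on the first two columns only
km <- kmeans(df[, c("lat", "lon")], centers = 15)

# kmeans returns one cluster label per input row, in input order,
# so the labels can be attached directly as a new column
stopifnot(length(km$cluster) == nrow(df))
df$cluster <- km$cluster

# Base R counterpart of arrange(date, cluster)
sorted <- df[order(df$date, df$cluster), ]

head(sorted)
```

With a real file, a "date" column read as text may need converting with `as.Date()` before sorting, using a format string that matches the file; that format is not known from the question.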
