csv - Getting respective columns for the unique records in R
I have a large CSV file with millions of records and 6 columns. I want the unique records of one column, "name", together with the columns associated with those unique records. With 50,000 unique "name" records, I want the other 5 columns associated with those 50,000 records. I know how to get the unique records of a single column. In the code below I filter out the name column (the 1st column) into a separate data frame and then return the unique records using the unique function. I am not sure how to get the other 5 columns for those unique records.
m <- read.csv(file="test.csv", header=TRUE, sep=",", colClasses = c("character","NULL","NULL","NULL","NULL","NULL"))
names <- unique(m, incomparables = FALSE)
Yes, the others are unique w.r.t. the 1st column. If the same name is repeated and the rows have different entries in at least 1 of the other 5 columns, each such row counts as a unique one.
# Read all six columns; setting colClasses to "NULL" would drop the other five
m <- read.csv(file="test.csv", header=TRUE, sep=",", stringsAsFactors=FALSE)
m <- unique(m)  # remove rows that are duplicated across all columns
m_subset <- m[1:50000, ]  # subset the first 50000 rows
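To make the difference between "unique rows" and "one row per name" concrete, here is a minimal sketch on a toy data frame. The column names (name, city, score) and the values are hypothetical stand-ins for the contents of test.csv, which the question does not show. It relies only on base R's unique() and duplicated().

```r
# Toy stand-in for test.csv; column names and values are hypothetical.
m <- data.frame(
  name  = c("ann", "ann", "ann", "bob", "bob"),
  city  = c("oslo", "oslo", "rome", "lima", "lima"),
  score = c(1, 1, 2, 3, 3),
  stringsAsFactors = FALSE
)

# A row is dropped only when every column matches an earlier row,
# so a repeated name with a different city or score is kept.
distinct_rows <- unique(m)

# Alternative: keep only the first row seen for each name.
first_per_name <- m[!duplicated(m$name), ]

print(distinct_rows)
print(first_per_name)
```

If exactly one row per name is wanted instead, the duplicated() form is likely the one to use; unique(m) matches the rule described in the question, where rows differing in any of the other columns count separately.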
Refer to the following link for a better understanding:
https://stat.ethz.ch/r-manual/r-devel/library/base/html/unique.html