csv - Getting respective columns for the unique records in R

I have a large CSV file with millions of records and 6 columns. I want the unique records of one column, "name", together with the columns associated with those unique records. If there are 50,000 unique "name" records, I want the other 5 columns associated with those 50,000 records. I know how to get the unique records in a column. In the code below, I filter out the name column (the 1st column) into a separate data frame and return the unique records using the unique function. I am not sure how to get the other 5 columns for the unique records.

m <- read.csv(file = "test.csv", header = TRUE, sep = ",",
              colClasses = c("character", "NULL", "NULL", "NULL", "NULL", "NULL"))  # "NULL" skips a column, so only the 1st is read
names <- unique(m, incomparables = FALSE)

Yes, the other columns should be unique with respect to the 1st column. If the same name is repeated but the rows have different entries in at least 1 of the other 5 columns, each such row counts as a unique one.
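The rule described here (a repeated name still counts as a separate record when any other column differs) is how unique() behaves when it is given a whole data frame, since it compares entire rows. A minimal sketch on toy data; the column names and values are made up for illustration and are not from the original file:

```r
# Toy data: hypothetical columns, not the real test.csv
df <- data.frame(
  name  = c("ann", "ann", "ann", "bob"),
  city  = c("oslo", "oslo", "rome", "oslo"),
  score = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# unique() on a data frame compares whole rows, so only the exact
# repeat of ("ann", "oslo", 1) is dropped
distinct_rows <- unique(df)
n_distinct <- nrow(distinct_rows)

# Number of distinct names, for comparison
n_names <- length(unique(df$name))
```

Here "ann" appears in two of the remaining rows because her city differs between them, so the number of distinct rows can exceed the number of distinct names.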

m <- read.csv(file = "test.csv", header = TRUE, sep = ",")  # read all 6 columns; "NULL" entries in colClasses would drop them
m <- unique(m)  # remove duplicate rows
subset <- m[1:50000, ]  # subset the first 50000 rows
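The read, deduplicate, and subset steps can be tried end to end on a small stand-in file. This is only a sketch: the temporary file, its three columns, and the variable name first_rows are assumptions made for illustration, and every column is read so that the associated values survive deduplication.

```r
# Write a tiny stand-in for test.csv (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "name,a,b",
  "ann,1,x",
  "ann,1,x",
  "ann,2,y",
  "bob,3,z"
), tmp)

# Read every column, then drop rows that are exact duplicates
m <- read.csv(tmp, header = TRUE, sep = ",", stringsAsFactors = FALSE)
m <- unique(m)

# head() returns at most 50000 rows; unlike m[1:50000, ] it does not
# pad the result with NA rows when fewer rows are available
first_rows <- head(m, 50000)

unlink(tmp)  # clean up the temporary file
```

Using head() instead of a fixed index range is a small design choice that keeps the result well formed when the deduplicated data has fewer rows than requested.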

Refer to the following links for a better understanding:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/unique.html

unique on dataframe selected columns
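For deduplicating on selected columns only, one possible sketch uses unique() on a column subset and duplicated() on the key column. The data frame people and its columns are hypothetical; this is not necessarily the approach taken in the linked question.

```r
# Hypothetical example data
people <- data.frame(
  name  = c("ann", "ann", "bob", "bob"),
  city  = c("oslo", "rome", "oslo", "oslo"),
  score = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Distinct combinations of the selected columns only
key_combos <- unique(people[, c("name", "city")])

# One full row per name: duplicated() flags every repeat of a name
# after its first occurrence, so negating it keeps the first row
one_per_name <- people[!duplicated(people$name), ]
```

The second form keeps all columns but retains only the first row seen for each name, which differs from row-wise unique() when the other columns vary.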
