combn - Performing a function on all possible combinations of a subset of DF columns in R
I'd like to calculate the distance between row-wise pairs of lat/long coordinates. This is easily done with a variety of functions such as earth.dist. Where I'm stuck is that I'd like this to be part of a nightly data quality check process in which the number of pairs changes. Each row is a unique subject/person. Some days a few subjects have 4 sets of coordinates; other days the largest might be three. Is there an elegant way to perform this calculation using, e.g., all of the possible combinations formed by:
combn(geototal, 2)
where geototal is the number of coordinate sets on a given day, e.g. for x = 4 the set:
latitude.1, longitude.1, latitude.2, longitude.2, latitude.3, longitude.3, latitude.4, longitude.4.
My current loop looks like this, but of course it misses many of the possible combinations, especially as x gets larger than 4.
x = 1; y = 2
while (x <= geototal) {
  if (y > geototal) break
  eval(parse(text = sprintf("df$distance%d_%d = earth.dist(longitude.%d,latitude.%d,longitude.%d,latitude.%d)", x, y, x, x, y, y)))
  x <- x + 1
  y <- y + 1
}
Thanks for any thoughts on this!
Try this:
# using a built-in dataset
library(fossil)
data(fdata.lats)
df = fdata.lats@coords

# function to calculate pairwise distances
foo = function(df) {
  # find the number of points (rows)
  n = nrow(df)
  # find all combinations of two rows
  l = t(combn(n, 2))
  # loop over the combinations, calculate the distance, and store the output in a vector
  t = apply(l, 1, function(x) {earth.dist(df[x,])})
  # return the list of pairs and their distances; modify here if you want to print instead
  cbind(l, t)
}

# test run
foo(df)
               t
 [1,] 1  2  893.4992
 [2,] 1  3  776.3101
 [3,] 1  4 1101.1145
 [4,] 1  5 1477.4800
 [5,] 1  6  444.9052
 [6,] 1  7  456.5888
 [7,] 1  8 1559.4614
 [8,] 1  9 1435.2985
 [9,] 1 10 1481.0119
[10,] 1 11 1152.0352
[11,] 1 12  870.4960
[12,] 2  3  867.2648
[13,] 2  4  777.6345
[14,] 2  5  860.9163
...
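The function above pairs up the rows of a coordinate matrix. If the coordinate sets are instead spread across columns within each row, as in the question, the same combn idea can drive the column names. Below is a rough sketch of that, not a tested drop-in. It assumes the columns are named latitude.k / longitude.k as in the question. haversine_km and add_pairwise_distances are hypothetical helpers written only for this illustration (a plain haversine formula with an Earth radius of 6371 km stands in for earth.dist), and the toy data is made up.

```r
# Hypothetical helper (not from the original post): great-circle distance in km
# via the haversine formula, vectorised over rows. Inputs are in degrees.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Add one distance column per pair of coordinate sets found in each row.
# Assumes columns are named latitude.<k> / longitude.<k> for k = 1..geototal.
add_pairwise_distances <- function(df, geototal) {
  if (geototal < 2) return(df)   # nothing to pair up
  pairs <- combn(geototal, 2)    # one pair of set indices per column
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    df[[sprintf("distance%d_%d", i, j)]] <- haversine_km(
      df[[paste0("longitude.", i)]], df[[paste0("latitude.", i)]],
      df[[paste0("longitude.", j)]], df[[paste0("latitude.", j)]]
    )
  }
  df
}

# Toy data (made-up values): two subjects, up to three coordinate sets each
df <- data.frame(
  latitude.1 = c(0, 10), longitude.1 = c(0, 20),
  latitude.2 = c(0, 10), longitude.2 = c(1, 20),
  latitude.3 = c(1, NA), longitude.3 = c(0, NA)
)
geototal <- 3
result <- add_pairwise_distances(df, geototal)
names(result)
```

Because combn(geototal, 2) is recomputed on each run, the set of distance columns follows whatever geototal is that night, and a subject missing a coordinate set simply gets NA for the pairs involving it. This also avoids building code strings with eval(parse(...)).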