Dynamically modify values in data.table in R but getting only single constant value returned -
i trying modify values in data.tables in r. have come command thought job, instead puts same value in every row, while should dependent on information in row. here command:
tablea[,cola := tableb[colbd == unlist(strsplit(unlist(strsplit(colc," +"))[1], "-"))[1] & colbc == unlist(strsplit(unlist(strsplit(colc," +"))[1], "-"))[2] & colbb == unlist(strsplit(unlist(strsplit(colc," +"))[1], "-"))[3] ,colba]]
so want cola tablea take value of colba tableb corresponds date colc in tablea. example tablea
cola colb colc 1: 600 93 2012-12-14 23:45:00 2: 600 93 2012-12-15 23:45:00 3: 600 93 2012-12-14 23:45:00 4: 600 93 2012-12-15 23:45:00 5: 600 93 2012-12-14 23:45:00 6: 600 93 2012-12-15 23:45:00 7: 600 93 2012-12-14 23:45:00 8: 600 93 2012-12-15 23:45:00 9: 600 93 2012-12-14 23:45:00 10: 600 93 2012-12-15 23:45:00
this relevant part tableb:
colba colbb colbc colbd 1: 600 14 12 2012 2: 601 15 12 2012
this resulting tablea should like
cola colb colc 1: 600 93 2012-12-14 23:45:00 2: 601 93 2012-12-15 23:45:00 3: 600 93 2012-12-14 23:45:00 4: 601 93 2012-12-15 23:45:00 5: 600 93 2012-12-14 23:45:00 6: 601 93 2012-12-15 23:45:00 7: 600 93 2012-12-14 23:45:00 8: 601 93 2012-12-15 23:45:00 9: 600 93 2012-12-14 23:45:00 10: 601 93 2012-12-15 23:45:00
i hope clear enough!
so using information @davidarenburg code more efficient using as.date()
, isodate
functions, didn't know use on data. command become this:
tablea[,cola := tableb[as.date(colc) == as.date(isodate(colbd,colbc,colbb)), colba]]
this works (for testset, not larger sample of data), gives me warning:
warning message: in `==.default`(as.date(colc), as.date(isodate(colbd, colbc, : longer object length not multiple of shorter object length
you doing inefficiently. not how should work data.table
.
here's 1 efficient way solve (there more efficient ways)
setkey(tableb[, indx := as.date(isodate(colbd, colbc, colbb))], indx) setkey(tablea[, indx := as.date(colc)], indx) tablea[tableb, cola := colba][, indx := null][] # cola colb colc # 1: 600 93 2012-12-14 23:45:00 # 2: 600 93 2012-12-14 23:45:00 # 3: 600 93 2012-12-14 23:45:00 # 4: 600 93 2012-12-14 23:45:00 # 5: 600 93 2012-12-14 23:45:00 # 6: 601 93 2012-12-15 23:45:00 # 7: 601 93 2012-12-15 23:45:00 # 8: 601 93 2012-12-15 23:45:00 # 9: 601 93 2012-12-15 23:45:00 # 10: 601 93 2012-12-15 23:45:00
here creating index
column in each data set (only once , assigning reference, instead of calling multiple times) used merging while assigning correct values in colba
cola
if insist on retaining original rows order, can modify code following
setkey(tableb[, indx := as.date(isodate(colbd, colbc, colbb))], indx) setkey(tablea[, `:=`(indx = as.date(colc), order = .i)], indx) setorder(tablea[tableb, cola := colba], order)[, `:=`(indx = null, order = null)][] # cola colb colc # 1: 600 93 2012-12-14 23:45:00 # 2: 601 93 2012-12-15 23:45:00 # 3: 600 93 2012-12-14 23:45:00 # 4: 601 93 2012-12-15 23:45:00 # 5: 600 93 2012-12-14 23:45:00 # 6: 601 93 2012-12-15 23:45:00 # 7: 600 93 2012-12-14 23:45:00 # 8: 601 93 2012-12-15 23:45:00 # 9: 600 93 2012-12-14 23:45:00 # 10: 601 93 2012-12-15 23:45:00
Comments
Post a Comment