I'm writing a function in R for two vectors A and B, where B is a subset of A.
Both A and B may contain repeated elements.
The result (A-B) should contain the elements of A that remain after removing the elements of B, counting repeats.
I started with setdiff(), but it does not account for repeated elements.
d<-c(1,1,1,5,5,5,3,0,10,10)
b<-c(1,1,0)
e<-setdiff(d,b)
e
[1] 5 3 10
Instead it should be:
c(1,5,5,5,3,10,10)
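To make the mismatch concrete, here is a small sketch using the same d and b. It checks that, for plain vectors like these, setdiff() matches "drop every element of d that appears anywhere in b, then de-duplicate", and then uses table() to show the per-value counts the desired result should have. The names same_as_filter, count_d, count_b and remaining are only illustrative.

```r
d <- c(1, 1, 1, 5, 5, 5, 3, 0, 10, 10)
b <- c(1, 1, 0)

# For plain vectors, setdiff() gives the same result as dropping every
# element of d that appears anywhere in b and then de-duplicating.
same_as_filter <- identical(setdiff(d, b), unique(d[!(d %in% b)]))
same_as_filter

# Per-value counts make the desired result explicit:
# subtract the counts in b from the counts in d.
count_d <- table(d)
count_b <- table(b)
remaining <- count_d
remaining[names(count_b)] <- as.vector(count_d[names(count_b)]) - as.vector(count_b)
remaining
```

The counts in remaining (one 1, no 0, one 3, three 5s, two 10s) correspond to the expected vector c(1,5,5,5,3,10,10).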
Since setdiff() did not give the result I wanted, I wrote a new function:
my.sample <- function(d, b) {
  y <- numeric()
  u <- numeric()
  t <- list()
  x <- numeric()
  rd <- rle(d)
  rb <- rle(b)
  h <- numeric()
  d.data <- data.frame(rd$lengths, rd$values)
  b.data <- data.frame(rb$lengths, rb$values)
  for (i in 1:nrow(b.data)) {
    y[i] <- b.data[i, 2]
    u[i] <- b.data[i, 1]
    h[i] <- (d.data[d.data$rd.values == y[i], 1] - u[i])
    d.data[d.data$rd.values == y[i], 1] <- h[i]
  }
  x <- d.data[, 1]
  for (j in 1:length(x)) {
    t[[j]] <- rep(d.data[j, 2], x[j])
  }
  return(unlist(t))
}
Then I tried:
my.sample(d,b)
[1] 1 5 5 5 3 10 10
This works for that specific example, but when I tried it on a more complicated vector, such as:
x<-rpois(100,10)
y<-sample(x,25,replace=F)
my.sample(x,y)
Error in rep(d.data[j, 2], x[j]) : invalid 'times' argument
In addition: There were 21 warnings (use warnings() to see them)
I get the error shown above.
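One likely explanation, offered as a reading of the code rather than a confirmed diagnosis: rle() only merges adjacent equal values, so for an unsorted vector the same value can appear in several runs. The lookup d.data$rd.values==y[i] then matches more than one row, which would account for the warnings, and repeated subtraction can push a run length below zero, which rep() rejects as an invalid 'times' argument. Below is a minimal sketch of an alternative that avoids rle() altogether. The name multiset_diff is hypothetical, and the sketch assumes that removing the first remaining match for each element of b is acceptable.

```r
# rle() only merges *adjacent* equal values, so an unsorted vector
# can yield several runs with the same value.
r <- rle(c(10, 7, 10, 10))
r$values    # 10 shows up in two separate runs
r$lengths

# Hypothetical helper (not from the original post): for each element of b,
# drop the first matching element still left in a.
multiset_diff <- function(a, b) {
  for (v in b) {
    idx <- match(v, a)              # first remaining occurrence of v, or NA
    if (!is.na(idx)) a <- a[-idx]   # remove exactly one occurrence
  }
  a
}

d <- c(1, 1, 1, 5, 5, 5, 3, 0, 10, 10)
b <- c(1, 1, 0)
multiset_diff(d, b)   # 1 5 5 5 3 10 10

x <- rpois(100, 10)
y <- sample(x, 25, replace = FALSE)
length(multiset_diff(x, y))   # 75, since every element of y is found in x
```

The loop is quadratic in the worst case, which is fine for vectors of this size. The remaining elements keep their original order in a.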