Wednesday, October 9, 2013

plyr parallel backend using doSNOW

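Many computations are "embarrassingly parallel": each piece can run independently of the others. In R, parallelizing them can be simple: register a doSNOW cluster as the backend and pass .parallel=TRUE to the plyr functions.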

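library(MASS)
library(plyr)
library(doSNOW)
# Start 8 socket workers and register them as plyr's parallel backend
cl <- makeCluster(8, type = 'SOCK')
registerDoSNOW(cl)
# Load MASS on every worker so that mvrnorm() is available there
clusterEvalQ(cl, library(MASS))
# Note: set.seed() seeds only the master process, not the workers
set.seed(123)
# 15 draws of 10000 bivariate normal samples, as a 15 x 10000 x 2 array
x <- laply(1:15, function(i) {
  mvrnorm(n = 10000, mu = c(0, 0), Sigma = matrix(c(1, 0.9, 0.9, 1), 2, 2))
}, .parallel = TRUE)
# Sample covariance matrix of each draw
alply(x, 1, function(m) { var(m) }, .parallel = TRUE)
stopCluster(cl)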
Executing the above code generates the following output, a list of 15 sample covariance matrices, each close to the Sigma passed to mvrnorm:
$`1`
          1         2
1 0.9949429 0.8972545
2 0.8972545 1.0001936

$`2`
          1         2
1 1.0088491 0.9009547
2 0.9009547 0.9927314

$`3`
          1         2
1 1.0006708 0.9000211
2 0.9000211 1.0036544

$`4`
          1         2
1 0.9842565 0.8848799
2 0.8848799 0.9885273

$`5`
          1         2
1 0.9965336 0.8995234
2 0.8995234 0.9973838

$`6`
          1         2
1 1.0207631 0.9237891
2 0.9237891 1.0221271

$`7`
          1         2
1 1.0119796 0.9018527
2 0.9018527 0.9990797

$`8`
          1         2
1 1.0008902 0.9052643
2 0.9052643 1.0094707

$`9`
          1         2
1 0.9995138 0.8899004
2 0.8899004 0.9836347

$`10`
          1         2
1 1.0270531 0.9240547
2 0.9240547 1.0211015

$`11`
          1         2
1 1.0027423 0.9018306
2 0.9018306 0.9993059

$`12`
          1         2
1 0.9946847 0.8903604
2 0.8903604 0.9904657

$`13`
          1         2
1 1.0088730 0.9061493
2 0.9061493 1.0029127

$`14`
          1         2
1 0.9895536 0.8938813
2 0.8938813 0.9967428

$`15`
          1         2
1 0.9999374 0.9005808
2 0.9005808 1.0010201

attr(,"split_type")
[1] "array"
attr(,"split_labels")
   X1
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
11 11
12 12
13 13
14 14
15 15
Taken from this Stack Overflow page.
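
As a variation on the code above, plyr can also ask foreach to load packages on the workers through the .paropts argument, which may replace the explicit clusterEvalQ call. The following is only a minimal sketch under a few assumptions: plyr, doSNOW and MASS are installed, the cluster is reduced to 2 workers and 4 draws to keep it light, and the names draws and covs are made up for illustration.

```r
library(plyr)
library(doSNOW)

# Assumption: 2 workers and 4 draws, smaller than in the post, to keep this light
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# .paropts is forwarded to foreach(); .packages = "MASS" loads MASS on each
# worker, so no separate clusterEvalQ(cl, library(MASS)) call is made here
draws <- laply(1:4, function(i) {
  mvrnorm(n = 10000, mu = c(0, 0), Sigma = matrix(c(1, 0.9, 0.9, 1), 2, 2))
}, .parallel = TRUE, .paropts = list(.packages = "MASS"))

# One sample covariance matrix per draw (var() is available on the workers)
covs <- alply(draws, 1, function(m) var(m), .parallel = TRUE)

stopCluster(cl)
```

Here draws should be a 4 x 10000 x 2 array and covs a list of four 2 x 2 matrices. Since the random numbers are generated on the workers, the exact values will differ from run to run.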
