C++ - User-defined reduction on a vector of varying size
I'm trying to define my own reduction for vectors of complex<float>, following this answer to the question on reducing on an array in OpenMP.
But the size of my vectors isn't fixed at compile time, so I'm not sure how to define the initializer for the vector in the declare reduction pragma. That is, I can't just have
    initializer( omp_priv=tcomplexvector(10,0) )
but the initializer is needed for vectors.
How can I pass the initializer clause the size of the vector I need at run time? What I have so far is below:
    typedef std::vector<complex<float>> tcmplxvec;

    void complexadd(tcmplxvec & x, tcmplxvec & y){
        for (int i=0; i<x.size(); i++)
        {
            x.real() += y.real();
            //... same for the imaginary part and other operations
        }
    }

    #pragma omp declare reduction(addcmplx: tcmplxvec: \
    complexadd(&omp_out, &omp_in)) initializer( \
    omp_priv={tcmplxvec(**here I want a variable length**,0} )

    void dosomeoperation ()
    {
        //tcmplxvec vec is empty and anothervec is not

        //so each thread runs the inner loop serially
        #pragma omp parallel for reduction(addcmplx: vec)
        for ( n=0 ; n<10 ; ++n )
        {
            for (m=0; m<=somelength; ++m){
                vec[m] += anothervec[m+someoffset dependent on n and else];
            }
        }
    }
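You have to dig a little bit to find it online right now, but in section 2.15 of the OpenMP standard, where user-declared reductions are discussed, you'll find that "the special identifier omp_orig can also appear in the initializer-clause and it will refer to the storage of the original variable to be reduced."
So you can use
    initializer (omp_priv=tcmplxvec(omp_orig.size(),0))
or just
    initializer ( omp_priv(omp_orig) )
to initialize the vector in the reduction. Note that the second form copies the original's values as well as its size, so it only acts as the identity for + when the original vector holds zeros going into the parallel region.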
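As a rough sketch of the first form applied to a loop shaped like the one in the question: the helper names add_into and sum_windows, and the indexing src[n + m], are made up for illustration and are not from the question. The combiner takes its arguments by reference, so omp_out and omp_in are passed directly rather than by address. The code only uses pragmas, so it should give the same result when compiled without OpenMP.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

typedef std::vector<std::complex<float> > tcmplxvec;

// Hypothetical combiner: element-wise accumulate `in` into `out`.
// Assumes both vectors have the same size.
inline void add_into(tcmplxvec &out, const tcmplxvec &in) {
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] += in[i];
}

// Each private copy starts as a zero vector with the run-time length
// of the original variable, obtained through omp_orig.
#pragma omp declare reduction(addcmplx : tcmplxvec : add_into(omp_out, omp_in)) \
    initializer(omp_priv = tcmplxvec(omp_orig.size(), std::complex<float>(0.0f, 0.0f)))

// Hypothetical helper: sums `rows` shifted windows of `src` into a vector
// of length `len`. Requires src.size() >= rows + len - 1.
tcmplxvec sum_windows(const tcmplxvec &src, int rows, int len) {
    tcmplxvec vec(len);  // value-initialized, so every element is (0,0)
    #pragma omp parallel for reduction(addcmplx : vec)
    for (int n = 0; n < rows; ++n)
        for (int m = 0; m < len; ++m)
            vec[m] += src[n + m];
    return vec;
}
```

With src[i] = (i, -i) for i = 0..5, calling sum_windows(src, 3, 4) should return (3,-3), (6,-6), (9,-9), (12,-12), whatever the thread count, since every partial sum here is a small integer that float represents exactly.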
So the following works (note that you don't need to write your own routine; you can use std::transform and std::plus to add your vectors; you could also use std::valarray rather than vectors, depending on how you use them, which already has operator+ defined):
    #include <complex>
    #include <vector>
    #include <algorithm>
    #include <functional>
    #include <iostream>
    #include <cstdlib>   // atoi
    #include <omp.h>

    typedef std::vector< std::complex<float> > tcmplxvec;

    // Combiner: omp_out[i] = omp_in[i] + omp_out[i] for every element.
    // Initializer: each private copy is copy-constructed from the original.
    #pragma omp declare reduction( + : tcmplxvec : \
            std::transform(omp_in.begin( ),  omp_in.end( ), \
                           omp_out.begin( ), omp_out.begin( ), \
                           std::plus< std::complex<float> >( )) ) \
                           initializer (omp_priv(omp_orig))

    int main(int argc, char *argv[]) {

        int size;

        if (argc < 2)
            size = 10;
        else
            size = atoi(argv[1]);

        tcmplxvec result(size,0);

        #pragma omp parallel reduction( + : result )
        {
            int tid=omp_get_thread_num();

            // Thread tid adds its id to the first tid+1 elements.
            for (int i=0; i<std::min(tid+1,size); i++)
                result[i] += tid;
        }

        for (int i=0; i<size; i++)
            std::cout << i << "\t" << result[i] << std::endl;

        return 0;
    }
Running this gives

    $ OMP_NUM_THREADS=1 ./reduction 8
    0       (0,0)
    1       (0,0)
    2       (0,0)
    3       (0,0)
    4       (0,0)
    5       (0,0)
    6       (0,0)
    7       (0,0)

    $ OMP_NUM_THREADS=4 ./reduction 8
    0       (6,0)
    1       (6,0)
    2       (5,0)
    3       (3,0)
    4       (0,0)
    5       (0,0)
    6       (0,0)
    7       (0,0)

    $ OMP_NUM_THREADS=8 ./reduction 8
    0       (28,0)
    1       (28,0)
    2       (27,0)
    3       (25,0)
    4       (22,0)
    5       (18,0)
    6       (13,0)
    7       (7,0)
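The std::valarray alternative mentioned above might look something like this. It is only a sketch: the typedef tcmplxarr and the helper triangular_sum are names invented here, and the loop iterates over a fixed number of "rounds" instead of thread ids so that the result does not depend on how many threads run. Since valarray already has an element-wise operator+=, the combiner shrinks to a single expression, and valarray's size constructor value-initializes its elements, so the private copies start at zero.

```cpp
#include <complex>
#include <cstddef>
#include <valarray>

typedef std::valarray<std::complex<float> > tcmplxarr;

// Combiner uses valarray's element-wise operator+=.
// Initializer builds a zeroed array with the original's run-time size.
#pragma omp declare reduction(+ : tcmplxarr : omp_out += omp_in) \
    initializer(omp_priv = tcmplxarr(omp_orig.size()))

// Hypothetical helper: round r adds r to elements 0..r (clipped to size),
// mirroring what thread `tid` does in the vector example.
tcmplxarr triangular_sum(int rounds, std::size_t size) {
    tcmplxarr result(size);  // all elements (0,0)
    #pragma omp parallel for reduction(+ : result)
    for (int r = 0; r < rounds; ++r)
        for (std::size_t i = 0; i < size && i <= static_cast<std::size_t>(r); ++i)
            result[i] += std::complex<float>(static_cast<float>(r), 0.0f);
    return result;
}
```

Calling triangular_sum(4, 8) should give the same values as the 4-thread run above: (6,0), (6,0), (5,0), (3,0), followed by four (0,0) entries. Whether valarray is a better fit than vector depends on how the data is used elsewhere, since valarray lacks most of the container interface.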