09:09:40 From Simppa Äkäslompolo: Q: sin-cos and matmul: How could I have known beforehand whether it is better to parallelize the inner or the outer loop? (I tested and your solution was best!) Is there a rule of thumb?
09:10:33 From Pierpaolo Minelli: Is it possible to use the runtime schedule for this example?
09:13:06 From Tiago Ribeiro: And as an add-on to the previous question: what about opening the parallel region around the outer loop and using #pragma omp for on the inner one?
09:15:15 From Virginie Grandgirard: Why don't you try collapse(3)?
09:15:40 From Serhiy Mochalskyy: Shouldn't i, j, k be private?
09:16:01 From Pierpaolo Minelli: Why is n declared firstprivate instead of shared?
09:37:06 From Dimitrios Peponis (NCSRD/NKUA): Are the tasks created once and reused, or are new tasks created at every iteration? That would probably introduce significant overhead.
09:38:19 From Michael Klemm: At each iteration a new task is created, as it has to carry its own data environment (in the example, the list item). So yes, there is overhead involved with that. We will cover how to reduce the overhead at the end of today's webinar.
10:23:32 From Dimitrios Peponis (NCSRD/NKUA): Thank you for the answer. Is there a way to create tasks, make them wait, and use them during execution at will? For example, create tasks 1, 2, 3 and then use 2 followed by 3.
10:24:44 From Christian Terboven: You can order tasks via dependencies: make t3 wait for t2.
10:24:57 From Christian Terboven: #pragma omp task depend(out: x) // t2
10:25:07 From Christian Terboven: #pragma omp task depend(in: x) // t3
10:25:54 From Christian Terboven: Please note that if you use dependencies at this fine level, you might significantly limit the degree of parallelism.
10:27:48 From Dimitrios Peponis (NCSRD/NKUA): Thank you one more time. Basically, my first idea is to use tasks to "guide" different functions with vast amounts of independent data. For example, assign foo() to task (or thread) 1, bar() to task (or thread) 2, and so forth. Something similar to sections, I assume.
10:28:46 From Christian Terboven: Are foo() and bar() independent? I would not recommend manually binding to threads at that level, because the resulting program would depend heavily on the number of threads used. Processors will have more and more cores in the future.
10:28:49 From Christian Terboven: However, you could write:
10:28:59 From Christian Terboven: #pragma omp parallel num_threads(X)
10:29:12 From Dimitrios Peponis (NCSRD/NKUA): Yes, they are independent.
10:29:13 From Christian Terboven: if (omp_get_thread_num() == 0) foo();
10:29:15 From Christian Terboven: else
10:29:17 From Christian Terboven: bar();
10:31:15 From Dimitrios Peponis (NCSRD/NKUA): Yes, something like that.
10:37:18 From Nicola Varini: Could you share the code?
10:38:20 From Christian Terboven: Which code are you interested in?
10:38:37 From Nicola Varini: The Gauss-Seidel task dependence example.
10:38:48 From Nicola Varini: It would be interesting to reproduce the performance.
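
A minimal sketch of the loop-parallelization questions from the first block of the chat (inner vs. outer loop, private i/j/k, firstprivate n, collapse, runtime schedule). The array names a, b, c, the size N, and the exact clauses are illustrative assumptions, not the webinar's code. The usual rule of thumb: parallelize the outermost loop that is safe to parallelize, so each thread gets large contiguous chunks of work and the parallel region is entered only once; putting the parallel region outside and an omp for on an inner loop keeps the threads alive but adds a barrier per outer iteration.

/* Sketch only (not the webinar's example): outer-loop parallel matmul.
 * i is the parallelized loop variable; j and k must also be private so
 * threads do not overwrite each other's indices.  n is read-only, so a
 * firstprivate copy and a shared variable behave the same here.
 * schedule(runtime) lets OMP_SCHEDULE choose the schedule at run time;
 * collapse(2) could merge the i and j loops if the outer loop alone does
 * not expose enough parallelism. */
#include <stdio.h>

#define N 512

static double a[N][N], b[N][N], c[N][N];

int main(void)
{
    int n = N;
    int i, j, k;

    #pragma omp parallel for private(i, j, k) firstprivate(n) schedule(runtime)
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            for (k = 0; k < n; k++)
                c[i][j] += a[i][k] * b[k][j];

    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}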
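
Michael's answer about per-iteration task creation points to overhead-reduction material covered later in the webinar, which is not reproduced here. One common approach, offered only as an assumption about what that might look like, is taskloop with a grainsize, so the runtime creates one task per chunk of iterations instead of one per iteration. The array v and size N below are placeholders.

/* Sketch (not necessarily the technique shown in the webinar): taskloop
 * with grainsize creates roughly N/10000 tasks instead of N tiny ones,
 * cutting the task-creation overhead. */
#include <stdio.h>

#define N 1000000

static double v[N];

int main(void)
{
    #pragma omp parallel
    #pragma omp single
    {
        /* one task per ~10000 iterations rather than one per iteration */
        #pragma omp taskloop grainsize(10000)
        for (int i = 0; i < N; i++)
            v[i] = 2.0 * i;
    }
    printf("v[N-1] = %f\n", v[N - 1]);
    return 0;
}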
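
Christian's depend(out:)/depend(in:) fragments, completed into a compilable sketch. The shared variable x and the task bodies are placeholders; the point is only that t3 cannot start before t2 has finished.

/* Sketch: ordering sibling tasks with dependences.  t3 depends on the
 * same list item x that t2 produces, so t3 waits for t2. */
#include <stdio.h>

int main(void)
{
    int x = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: x)   /* t2: produces x */
        x = 42;

        #pragma omp task depend(in: x)    /* t3: consumes x, runs after t2 */
        printf("t3 sees x = %d\n", x);
    }   /* implicit barrier of single: both tasks complete here */

    return 0;
}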
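
Christian's num_threads fragment, completed into a compilable sketch with placeholder foo() and bar() bodies. As he cautions in the chat, this hard-codes the mapping of work to thread numbers; independent tasks or sections scale more gracefully as core counts grow.

/* Sketch: run two independent functions concurrently by splitting on the
 * thread number inside a two-thread parallel region. */
#include <omp.h>
#include <stdio.h>

static void foo(void) { printf("foo on thread %d\n", omp_get_thread_num()); }
static void bar(void) { printf("bar on thread %d\n", omp_get_thread_num()); }

int main(void)
{
    #pragma omp parallel num_threads(2)
    {
        if (omp_get_thread_num() == 0)
            foo();
        else
            bar();
    }
    return 0;
}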