How many workers do you choose when running a parallel job in Julia? The answer is easy right? The number of physical cores. We always default to that number. For my Core i7 4770K, that means it's 4, not 8 since that would include the hyperthreads. On my FX8350, there are 8 cores, but only 4 floating-point units (FPUs) which do the math, so in mathematical projects, I should use 4, right? I want to demonstrate that it's not that simple.
Where the Intuition Comes From
Most of the time when doing scientific computing you are doing parallel programming without even knowing it. This is because a lot of vectorized operations are "implicitly paralleled", meaning that they are multi-threaded behind the scenes to make everything faster. In other languages like Python, MATLAB, and R, this is also the case. Fire up MATLAB ... READ MORE
(Disclaimer: This is not a full-Julia solution for using the Phi, and instead is a tutorial on how to link OpenMP/C code for the Xeon Phi to Julia. There may be a future update where some of these functions are specified in Julia, and Intel's compilertools.jl looks like a viable solution, but for now it's not possible.)
Intel's Xeon Phi has a lot of appeal. It's an instant cluster in your computer, right? It turns out it's not quite that easy. For one, the installation process itself is quite tricky, and the device has stringent requirements for motherboard choices. Also, making out at over a taraflop is good, but not quite as high as NVIDIA's GPU acceleration cards.
However, there are a few big reasons why I think our interest in the Xeon Phi should be renewed. For one, Intel ... READ MORE
This is the exciting Part 3 to using Julia on an HPC. First I got you started with using Julia on multiple nodes. Second, I showed you how to get the code running on the GPU. That gets you pretty far. However, if you got a trial allocation on Cometand started running jobs, you may have noticed when looking at the architecture that you're not getting to use the full GPU. In the job script I showed you, I asked for 2 GPUs. Why? Well, that's because the flagship NVIDIA GPU, the Tesla K80, is actually a duel GPU and you have to control the two parts separately. You may have been following along on your own computer and have been wondering how you use the multiple GPUs in your setup as well. This tutorial will ... READ MORE
Today I am going to show you how to parallelize your Julia code over some standard HPC interfaces. First I will go through the steps of parallelizing a simple code, and then running it with single-node parallelism and multi-node parallelism. The compute resources I will be using are the XSEDE (SDSC) Comet computer (using Slurm) and UC Irvine's HPC (using SGE) to show how to run the code in two of the main job schedulers. You can follow along with the Comet portion by applying and getting a trial allocation. READ MORE
It can sometimes be quite daunting to get the information you need. When looking for the right HPC to run code on, there are a lot of computers in the US to choose from. I decided to start compiling a lot of the information into some tables in order to make it easier to understand the options. I am right now starting with a small subset which includes Blue Waters and some of XSEDE with rows for the parts that interest me, but if you would like for me to add to the list please let me know. READ MORE