From Bron Nelson, February 28, 2019

In the newest version, it is no longer necessary to hand-edit the
constants in "recvTask.c" and "readtile_mpiio.c". Instead, the file
"SIZE.h" has been modified in two ways:
  (1) SIZE.h now includes the constant "sFacet"
  (2) SIZE.h may now be #included in both C and Fortran files
This means that "recvTask.c" and "readtile_mpiio.c" now get the
information they need directly from "SIZE.h", so the magic constants
for the run only need to be edited in one place (namely, SIZE.h).

One tile per rank is recommended, mostly for pickup input performance,
but it is not strictly necessary. A minimum of one full node of I/O
ranks is required. The async I/O allocates whole nodes: each node is
either an I/O node or a compute node. It is permitted for the last
*compute* node to have a "ragged edge", i.e., have fewer MPI processes
on it than the other nodes do. But the I/O nodes are all "full size".

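As a rough planning aid, the whole-node rule above can be sketched in
Python. This is a hypothetical helper, not part of the async I/O code:
the function name, its arguments, and the assumption that the job's
total rank count is simply compute ranks plus I/O ranks are all
illustrative.

```python
import math

def node_layout(compute_ranks, io_nodes, ranks_per_node):
    """Hypothetical helper: count nodes under the whole-node rule.

    I/O nodes are always full; only the last compute node may be short.
    """
    if io_nodes < 1:
        raise ValueError("at least one full I/O node is required")
    if compute_ranks < 1 or ranks_per_node < 1:
        raise ValueError("rank counts must be positive")
    compute_nodes = math.ceil(compute_ranks / ranks_per_node)
    # Ranks on the last compute node (the "ragged edge" if not full).
    last = compute_ranks - (compute_nodes - 1) * ranks_per_node
    io_ranks = io_nodes * ranks_per_node
    return {
        "compute_nodes": compute_nodes,
        "io_nodes": io_nodes,
        "io_ranks": io_ranks,
        "total_ranks": compute_ranks + io_ranks,  # assumed: simple sum
        "last_compute_node_ranks": last,
    }
```

For example, node_layout(100, 2, 24) describes 100 compute ranks at 24
ranks per node with two full I/O nodes; the last compute node then
carries fewer ranks than the others.
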
The other minimum value the I/O code requires is that there must
be at least one core for each field you want to write, e.g., if you
are dumping 20 different fields, there must be at least 20 cores
allocated to the I/O. Note that the 20 (or whatever) number is
*aggregate* across all the I/O nodes, NOT a "per node" number.

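The aggregate nature of this rule is easy to get wrong, so here is a
minimal sketch of the check. The function name and arguments are
hypothetical, and it assumes every I/O node contributes the same number
of cores.

```python
def enough_io_cores(num_fields, io_nodes, cores_per_io_node):
    """True if the I/O cores, summed over all I/O nodes, give at least
    one core per output field (an aggregate count, not per node)."""
    total_io_cores = io_nodes * cores_per_io_node
    return total_io_cores >= num_fields
```

With 20 fields, two I/O nodes of 10 cores each pass the check even
though neither node has 20 cores by itself.
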
Another important constraint is that the total memory on all the I/O
nodes *collectively* needs to be twice as big as the largest epoch you
write. So, if you are writing a 1.5 TB pickup dump, then you should
have a sum total of 3 TB of memory (or more) on the set of I/O nodes.

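The factor-of-two memory rule can be turned into a quick sizing
estimate. The sketch below is illustrative only: the function names are
hypothetical, the per-node memory figure in the example is an assumed
value, and sizes may be in any unit as long as it is used consistently.

```python
import math

def io_memory_ok(largest_epoch, io_nodes, mem_per_node, factor=2):
    """True if total I/O-node memory is at least factor * largest epoch."""
    return io_nodes * mem_per_node >= factor * largest_epoch

def min_io_nodes_for_memory(largest_epoch, mem_per_node, factor=2):
    """Smallest number of I/O nodes satisfying the memory rule
    (never fewer than the one full I/O node that is always required)."""
    return max(1, math.ceil(factor * largest_epoch / mem_per_node))
```

For a 1.5 TB dump expressed as 1500 GB and an assumed 128 GB per node,
min_io_nodes_for_memory(1500, 128) gives the number of I/O nodes needed
for memory alone; the one-core-per-field rule may require more.
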
Choose dumpFreq and pChkptFreq as usual. We're not set up
to do the rolling checkpoints yet. It currently dumps u, v, t, and etan;
send me a list of other fields you want, as it is rather involved
to change them. But this should be enough to see if it works.

Set run-time parameter: useSingleCPUio=.FALSE.

Only a couple of files are different from the previous version.
But note in particular that "SIZE.h" is a new file in that directory,
and "recvTask.c" has a huge number of changes.

The input scheme implemented here is only invoked on
the 64-bit pickup files. It is specific to the LLC decomposition and will
not work on e.g. the Monterey high-res simulations we did a couple years
ago. (However, the code should work for any facet size specified
in SIZE.h.) The format of SIZE.h was changed so that it can be included
in both C and Fortran files, and I also added the "sFacet" constant that
specifies the base facet size (e.g. 1080). So SIZE.h will probably look
kinda weird at first, but shouldn't be hard to figure out. The major
advantage is that now you no longer need to edit any magic constants in
recvTask.c and readtile_mpiio.c; they now derive the info they need by
directly including SIZE.h.

The code now automatically figures out how many ranks are running per node.
You can run with whatever number of ranks per node that you want, but the
number needs to be consistent for all nodes (except possibly the last node,
which can be short).

The initial burst of output generated by recvTask.c (the "map" describing
the way the I/O processes are allocated) is now somewhat longer and more
detailed, but can continue to be ignored.

I did NOT try to cure the "integer" problem. It seems that the code is
getting fairly close to bumping into the 2G (i.e. 2^31) limit on numbers
that fit into a default integer. I *think* you can probably do one more
doubling of the resolution (to 8640), but I'm also pretty sure that going
past that will break the code.
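
To give a feel for the scale involved, the sketch below compares one
quantity that grows with the square of the facet size against the
32-bit signed integer limit. It rests on assumptions not stated in this
note: that the LLC grid consists of 13 square faces of sFacet x sFacet
points, and that the count of horizontal grid points is representative
of what overflows. The variable that actually hits the limit in the
code may be a different one.

```python
INT32_MAX = 2**31 - 1  # largest value of a 32-bit signed integer

def llc_horizontal_points(s_facet, faces=13):
    """Assumed point count: `faces` square faces of s_facet x s_facet."""
    return faces * s_facet * s_facet

def fits_default_integer(s_facet):
    """True if the assumed point count fits in a 32-bit signed integer."""
    return llc_horizontal_points(s_facet) <= INT32_MAX
```

Under these assumptions a facet size of 8640 still fits, while the next
doubling (17280) does not, which is consistent with the estimate above.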