Technical Collaborations
Make sure you have the latest version of this memo.
Version Date: September 22, 2016.
Type of Memo: Informative
Introduction
This is a short list of useful things for our collaboration. It
is not intended to replace communication but to make it more
efficient. Comments and criticisms about this document are
welcome.
Remote communication and Netiquette guide. PLEASE READ AND ADHERE
TO the Netiquette guidelines.
Basic Knowledge
Here are basic resources that you need to know well. Links
provided are merely pointers, unless otherwise indicated.
- Linux platform: http://tldp.org/LDP/intro-linux/html/
In addition, you will need to familiarize yourself with the
distribution you will be using.
- Basic bash.
http://www.tldp.org/LDP/Bash-Beginners-Guide/html/
- git. We will use git for all our codebases. There is plenty of
documentation about git out there (http://git-scm.com/documentation),
and don't miss the full online book.
- C Programming Language or Assembly. You need to understand
computers from the ground up. Highly recommended: Jonathan
Bartlett, Programming from the Ground Up, Bartlett Publishing,
2004, ISBN 0-9752838-4-7.
- Using make. Just the basic usage, run "man make," or search for
a tutorial.
- Basic C++. I know C++ is big and involved, but you will need to
know at least the "Hello, world" level of C++, that is, the big
picture.
Additional Knowledge
Depending on what your work with me will entail, you might need
the following additional knowledge.
- Code conventions. If you do any programming with me, please
adhere to our coding style.
- Perl. If you are going to do scripting, it'll probably be done
in Perl. Feel free to ask why not Python or Ruby or XYZ. But if you
don't have an urge to ask the question, then that's OK too.
Scripting is not an excuse for sloppy programming or bad style.
Adherence to our coding style or modified conventions is expected,
as far as possible and practical.
- The D Programming Language. We'll start using the D Programming
Language as a substitute for Perl as soon as gdc is merged into
gcc.
- C++ templates and C++ virtual functions. Please know your C++.
C++ is a vast land, and acquiring knowledge in this area could take
years. If your assignment includes C++ coding you'll need to know at
least C++ templates. How much do you need to know? Make sure you can
understand everything in our coding style guide. There's a legal
aspect (what can be done) and a moral aspect (what should be done)
to C++. You can probably learn both from books. Stroustrup's TC++PL
is a must as a reference. Alexandrescu's "Modern C++ Design" is
excellent for inheritance with virtual functions, and contains a
good amount of templates, but perhaps not enough. There is a book by
D. Vandevoorde and N. Josuttis, "C++ Templates: The Complete Guide,"
that covers templates extensively. Josuttis has another one dealing
with the STL. As for the moral part, let me mention Sutter and
Alexandrescu, "C++ Coding Standards," which I used to write
CodingStyle.pdf. (Note: You definitely don't need to buy books for
the work you will be doing with me. You can find them in the
library, or we can use the notes.) In number-crunching codes,
performance is an issue, and virtual functions (run-time dispatch)
need to be used with care. That's why we emphasize templates
(compile-time dispatch).
- Using gdb. Learn to use the debugger, because even though it's
helpful if you say "I get 'Segmentation fault' when I run the xyz
code," it's more helpful if you say "Please consider pulling this
changeset that fixes a segfault in the xyz code."
- Valgrind memcheck (because sometimes the debugger isn't
enough). And it won't be enough more often than you might think.
Sometimes the debugger itself crashes, which you might find comical
now, but it isn't when you have a bug to hunt. And sometimes the
hardware conspires against you, and segfaults are delayed until they
show up in unexpected places, usually at deallocation. In these
cases, gdb will likely point you to the wrong place.
- Profiling (at least with valgrind --tool=callgrind). It's
helpful if you say "The xyz code runs too slowly." It's more helpful
if you say "Function f(...) in the xyz code takes too much time."
And it's most helpful of all if you say "Please consider pulling
this changeset that fixes an unnecessary slowdown in function
f(...)."
- And there are more valgrind tools...
- Code documentation. If you need to document code, please do
read the documentation guidelines.
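
To make the templates item above concrete, here is a minimal sketch
of run-time versus compile-time dispatch. All names in it
(PotentialBase, HarmonicVirtual, HarmonicStatic, sumVirtual,
sumStatic) are made up for illustration and do not come from our
codebases or from the coding style guide.

```cpp
#include <cstddef>
#include <vector>

// Run-time dispatch: apply() is virtual, so each call is looked up
// through the object's virtual table while the program runs.
// (All names here are hypothetical, for illustration only.)
struct PotentialBase {
    virtual ~PotentialBase() {}
    virtual double apply(double x) const = 0;
};

struct HarmonicVirtual : PotentialBase {
    double apply(double x) const override { return 0.5 * x * x; }
};

double sumVirtual(const PotentialBase& p, const std::vector<double>& xs)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i)
        sum += p.apply(xs[i]); // resolved at run time
    return sum;
}

// Compile-time dispatch: the concrete type is a template parameter,
// so the compiler knows which apply() is meant and may inline it.
struct HarmonicStatic {
    double apply(double x) const { return 0.5 * x * x; }
};

template <typename PotentialType>
double sumStatic(const PotentialType& p, const std::vector<double>& xs)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i)
        sum += p.apply(xs[i]); // resolved at compile time
    return sum;
}
```

With xs = {1.0, 2.0, 3.0}, both sumVirtual(HarmonicVirtual(), xs) and
sumStatic(HarmonicStatic(), xs) return 7.0. Whether the template
version is actually faster depends on the compiler and the loop, so
profile before drawing conclusions.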
Writing and Visualization
This may be important in order to present your work.
- LaTeX. If your assignment requires you to write something that
(i) will need formatting, and (ii) will require my editing, then
please use LaTeX or plain TeX. Or ConTeXt. Or LuaTeX.
- Figures. For diagrams, sketches, and plots, please don't use
raster images when vector graphics will do. Use of pgfplots and
TikZ is preferred.
- Publications. Before I can consider a paper that includes me as
an author, you must follow requirementsBeforePublication.txt
(separate document provided on request).
- Talks. I strongly recommend LaTeX Beamer.
- Ray Tracing: https://github.com/mmp/pbrt-v2 and
https://bitbucket.org/luxrender/lux
Supercomputers and computer clusters
If you need to run on supercomputers and computer clusters, then
you will need the following:
- module command. On most supercomputers, software is handled
through environment modules (module load, module switch, and so
on). Please learn them.
- Portable Batch System or whatever queue software is used on the
supercomputer/cluster. Please learn about queues and walltimes, and
the difference between nodes and cores, and between processors and
threads.
- LAPACK and BLAS libraries. Please try to understand the linker
error, 'ld: cannot find -llapack', if you get one. It goes without
saying that you should learn what a linker is; you should have done
so anyway when you read the item "C Programming Language or
Assembly" above.
- Message Passing Interface (MPI) is needed if using
distributed-memory parallelization.
- If using shared-memory parallelization, please learn and use
"pthreads". No OpenMP, please.
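
As a pointer for the pthreads item above, here is a minimal sketch
of shared-memory parallelization with pthreads: a vector is split
into contiguous chunks and each thread sums its own chunk. The names
SumTask, sumWorker, and parallelSum are made up for this example;
only pthread_create and pthread_join are actual pthreads calls.

```cpp
#include <pthread.h>
#include <algorithm>
#include <cstddef>
#include <vector>

// Work assigned to one thread: sum data[begin, end) into result.
// (SumTask, sumWorker, parallelSum are hypothetical names.)
struct SumTask {
    const double* data;
    std::size_t begin;
    std::size_t end;
    double result;
};

// Thread entry point; pthreads expects the signature void* f(void*).
extern "C" void* sumWorker(void* arg)
{
    SumTask* task = static_cast<SumTask*>(arg);
    double sum = 0.0;
    for (std::size_t i = task->begin; i < task->end; ++i)
        sum += task->data[i];
    task->result = sum; // each thread writes only to its own task
    return nullptr;
}

// Splits v into nthreads contiguous chunks and sums them in parallel.
double parallelSum(const std::vector<double>& v, std::size_t nthreads)
{
    if (nthreads == 0) nthreads = 1;
    const std::size_t chunk = (v.size() + nthreads - 1) / nthreads;
    std::vector<SumTask> tasks(nthreads);
    std::vector<pthread_t> threads(nthreads);
    std::vector<char> started(nthreads, 0);

    for (std::size_t t = 0; t < nthreads; ++t) {
        tasks[t].data = v.data();
        tasks[t].begin = std::min(t * chunk, v.size());
        tasks[t].end = std::min((t + 1) * chunk, v.size());
        tasks[t].result = 0.0;
        if (pthread_create(&threads[t], nullptr, sumWorker, &tasks[t]) == 0)
            started[t] = 1;
        else
            sumWorker(&tasks[t]); // creation failed: do the work here
    }

    double total = 0.0;
    for (std::size_t t = 0; t < nthreads; ++t) {
        if (started[t]) pthread_join(threads[t], nullptr); // wait first
        total += tasks[t].result;
    }
    return total;
}
```

No mutex is needed here because the threads never write to the same
memory. Depending on the toolchain, you may need to compile and link
with g++ -pthread.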
Linear Algebra Knowledge
This will depend on your task, but most of my work deals with
quantum mechanics and quantum field theory in condensed matter.
Therefore, to understand the math involved, at the very minimum you
will need:
- Vector Spaces, Eigenvalues, Eigenvectors
- Hilbert Spaces, Dirac Notation (*), Outer Products.
(*) Dirac notation is only rigorous for finite-dimensional Hilbert
spaces. But that's all we'll need.
- Sparse Linear Algebra (Compressed row storage, Lanczos
algorithm)
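
As an illustration of compressed row storage from the last item,
here is a minimal sketch of a CRS matrix and its matrix-vector
product, which is the operation the Lanczos algorithm applies
repeatedly. The names CrsMatrix, fromDense, and multiply are made up
for this example and are not taken from our codebases.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal CRS container (names are illustrative only).
// The nonzeros of row i are values[k] for rowPtr[i] <= k < rowPtr[i+1],
// and colInd[k] is the column of values[k].
struct CrsMatrix {
    std::size_t rows;
    std::vector<std::size_t> rowPtr; // size rows + 1
    std::vector<std::size_t> colInd; // one entry per stored value
    std::vector<double> values;      // stored values, row by row
};

// Builds the CRS form of a dense matrix, dropping exact zeros.
CrsMatrix fromDense(const std::vector<std::vector<double> >& dense)
{
    CrsMatrix m;
    m.rows = dense.size();
    m.rowPtr.push_back(0);
    for (std::size_t i = 0; i < dense.size(); ++i) {
        for (std::size_t j = 0; j < dense[i].size(); ++j) {
            if (dense[i][j] == 0.0) continue;
            m.colInd.push_back(j);
            m.values.push_back(dense[i][j]);
        }
        m.rowPtr.push_back(m.values.size()); // end of row i
    }
    return m;
}

// Computes y = A x, touching only the stored entries of A.
// Assumes x has at least as many entries as A has columns.
std::vector<double> multiply(const CrsMatrix& a, const std::vector<double>& x)
{
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = a.rowPtr[i]; k < a.rowPtr[i + 1]; ++k)
            y[i] += a.values[k] * x[a.colInd[k]];
    return y;
}
```

For the 3x3 matrix with rows (2 0 0), (0 0 3), (1 0 4), fromDense
gives rowPtr = {0, 1, 2, 4}, colInd = {0, 2, 0, 2}, and
values = {2, 3, 1, 4}; multiplying by x = (1, 2, 3) gives (2, 9, 13).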