## Homework and Exams

All homework should be emailed to me. Always put MA 5310 in the Subject line of your message. I will send you a reply when I get your mail. If you do not get a reply, I did not get your email. You are responsible for having an email account that can communicate successively with me. I am flexible on which of my email accounts you use up to a degree. I will send you email from GMail, however.

All homework should be done individually unless explicitly stated in an individual homework assignment. It is acceptable to ask simple questions of other people, but do not collude or works together closely. All dead sources are acceptable, however.

HomeworkExamDueWorth hw1 1/26/2010 100 hw2 2/4/2010 250 hw3 2/25/2010 300 hw4 3/4/2010 100 hw5 4/12/2010 exam1 3/4-11/2010 Problems 1-3: 300,

Problem 4: 2004/20-22/2010 exam3 5/4/2010 ## Hw1

Since almost no one in the class seems to know C++ and the textbook requires knowledge of C++, you are to write a C++ class that is relatively simple, but not completely trivial and does something of interest computationally. A good example is design a class that implements a dense matrix and lets you references elements of a matrix A by A(i,j) for the element in the i-th row and j-th column.

Ideally it should have

- a header file that another program can include using the #include C++ construct defining

- constructor(s)
- destructor
- copy
- set the value of members of the class
- get the value of members of the class
- print the values of the member(s)
- an implementation file of what is declared in the header file.
- a third file with a main program that uses the class.
You should use g++ if possible.

Postscript: I want you to get the full score, so I am going to iterate until you get all the points. You can decide to get no points, however, but I would not recommend this option.

## Hw2

Investigate the parlib programs and complete the inner_prod++.cpp example. Then write a program called matrixvec++.cpp that does a dense matrix vector multiply in parallel. You should base your program on the C++ examples. Turn in your Makefile, inner_prod++.cpp, and matrixvec++.cpp files. Once again, use mpic++ with g++ if at all possible (since this is what I will use to test your codes).

Postscript: I am starting out with a clean copy of the parlib files, i.e., I do the following before testing each of your submissions:

rm -rf parlibI am copying only your Makefile, innerprod++.cpp, and matrixvec++.cpp files into the directory tree:

mkdir parlib

cd parlib

tar zxf ../parlib.tgzMakefile -> parlibI then change into parlib and type

innerprod++.cpp -> parlib/ExamplesInC++

matrixvec++.cpp -> parlib/ExamplesInC++makeEverything should copy into the parlib directory and make. You are not allowed to change where any file is in the default parlib directory structure. You may not write a brand new Makefile that ignores the given one in parlib (you were required to modify the existing one, not write a new one).Do not be too clever. You will be stuck fixing this assignment until you earn all 250 points and have learned the material. This is an all or nothing assignment. I want you to learn the material, period.

## Hw3

This is a theoretical homework (no programming necessary). You ca library search all of the answers. If you do, cite the literature. There is no penalty for library searched solutions. From the class notes of 2/18/2010, prove

- Lemma CG2 on page 62
- Lemma CG4 on page 62
- Lemma CG5 on page 62
- Theorem 3.9 on page 62
- Theorem 3.9 (should be 3.10) on page 63 (or an equivalent convergence theorem based on the condition number and another norm)
You may turn in handwritten solutions or typed ones. If you submit a document electronically, a pdf file is preferred. Scanning and emailing the solution is another preferred method. If you turn your solution in after class on 2/25, you must submit it electronically.

## Hw4

This provides closure to Hw2. A lot of the class had a common error in the matrix-vector code: A is localized using a block of rows and the vector x is different in each process. The assignment is to

- Write a clear mathematical algorithm for how to generate A in parallel, how you generate x, how you do all of the communications, and finally how to compute Ax correctly in parallel.
- Correct your code from Hw2.
If your x in Hw2 is globally the same in your assignment, you get 100 points automatically. You will receive an email excusing you from Hw4.

## Hw5

Problems from the textbook, section 3.5.1:

- Problem 8
- Problem 13 (use parlib,
MPI routines directly)not- Problem 17 (do as much as you can in four hours and then stop. You should learn enough even from an incomplete assignment)
A good rule of thumb in parallel programming is to do the problem first in Matlab as fast as you can. Then you know the correct answers for serial cases. Some people then make a second version in Matlab that mimics exactly what the parallel code would do in an ideal world. This is probably overkill for this course and assignment, however.

## Exam1

The exam is a take home test. The exam will be available by March 4. It is due at the beginning of class on March 11 (1:20 PM Mountain time). Late exams will not be accepted unless there is a reason that fits the university mandated late policy that is in the university handbook (see the syllabus for a web link). This exam covers material in Chapters 1, 2, 7, and 9 in the textbook.

You are to work on the exam individually without consulting anyone other than me, your professor. You may library search any question on the exam, however. If you find a solution by library searching (physical libraries or the Internet count), cite your source. You will find the updated class notes useful and the textbook essential.

- Develop an algorithm to compute the inner product of two real N-vectors x and y, <x,y>, so that the roundoff properties are the best possible, which will lead to a more accurate answer than the standard algorithm.

- Write the details of your algorithm.
- Prove it works mathematically.
- Demonstrate that it works on an example where floating point numbers have only 2 digits of precision, i.e., all numbers are of the form ±0.ab10^c, where a, b and integers between 0 and 9 and c is an integer exponent between -9 and 9. Use x = [ 90, .99, 0.1, -0.5, -0.1 ] and y = [ 1, 1, 0.1, 0.2, -0.1 ].

- Compute <x,y> in full precision (all possible digits) using the standard algorithm.
- Compute <x,y> using only the 2 digits format using the standard algorithm.
- Compute <x,y> using only the 2 digits format using your algorithm.
- Prove Lemma CG3 from the class notes, page 62.
- Solve problems 14-20 in Chapter 9.8 of the textbook. This will require you to read chapter 9 carefully for things I have not lectured on in class.
- Write a parallel conjugate gradient C++ code using the parlib programs to solve Ax = b. You should use pieces of your own homeworks 2 and 4 to complete the code. You need working parallel inner product and matrix-vector multiplier procedures for a general matrix. Use a right hand side b of all 0's (the solution is clearly all 0's, too) and an initial guess x of all 1's. Your matrix A needs to be symmetric, positive definite. Use a very specific matrix A that is pentadiagonal: [ -1, -1, 4, -1, -1], where the 4's are are on the main diagonal, the next upper and lower diagonals off the main one, and the remaining diagonals are n off of the main diagonal, where the order of A is N = n*n (n squared). The diagonal offsets from the main diagonal for the nonzeroes of A are simply [ -n, -1, 0, 1, n].

Hints: (a) Choose n small to debug your code. (b) There are no random numbers involved in this example. (c) Download the latest version of parlib.Clarifications: Handwritten answers to 1-3 are quite acceptable, 3/10/2010.

- There are 5 forms of rounding in IEEE arithmetic. Most computers use chopping and so should you. Do not use rounding, 3/6/2010.

Do not write a parallel computing algorithm, write a mathematical algorithm, 3/10/2010.- None
- Provide enough details so that I know you are not randomly guessing,.3/4/2010.
- Provide a general nxn matrix code, not a specific n,.3/4/2010.

No preconditioning necessary, 3/9/2010.## Exam2

There are two parts to the exam, Problem 1 (programming) and the remaining problems. Problem 1 is due on 4/22/2010 at 5:00 PM MST by email in a .tgz or .zip file (no .rar files). The remainder of the problems are due in class on 4/20/2010.

Put your name on every side of every piece of paper turned in on 4/20. You may also email me the remainder of your problems (scans are acceptable).

- Write a parallel program in C++ using the parlib program to compute the divided differences table that is in the class notes. Assuming that you have
pprocesses, require that the order of the tablen >> p(e.g.,n = 10p). Transmit the partial tables back to process0so that it has the full table. Ask the user from process 0 for information for evaluating the function or its derivatives at points in your domain and compute the result. Processes1-(p-1) can be ended after transmitting their part of the table to process0. Have a separate function called f that you will use in the creation of your table.- Textbook, Section 4.4, problem 8.
- Textbook, Section 4.4, problem 13.
- Textbook, Section 3.5.1, problem 3.
- Textbook, Section 3.5.1, problem 7.
## Exam3

The exam is in two parts:

- Take home: I have installed the ADOL package for automatic differentiation on the mgser1.uwyo.edu computer in ~douglas/adolc_base. I want you to use it to demonstrate that you understand how automatic differentiation works for simple problems. Create a function that is not going to be obvious to another student and let ADOL create a new function for partial derivatives. The function should be nontrivial. This is due by email before the in class final exam and by paper at the final exam. A pdf file with everything in it is preferred (Word is also ok, but not any form of TeX whatsoever). What I see is what I grade.
- There will be a closed book exam in RH 247 at the university specified time on May 4, 1:15-3:15 PM. Only bring as many blue books and pens or pencils as you think you might need. Failure to bring enough means possible failure on the exam. The exam will cover only material from anything covered in the lectures, homeworks, or exams.