Chinese University of Petroleum, Qingdao, China, July 19-24, 2015

Professor Craig C. Douglas




Find a MapReduce system on the Internet that you can program and install it on your computer. Look at the Sentence Problem. Write a pair of MapReduce functions (one map, one reduce) that finds the set of unique sentences in a dataset.
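As a starting point, here is a minimal sketch of the unique-sentences step in plain Python. It simulates the shuffle phase in memory rather than using any particular MapReduce system, and it assumes one sentence per input line with whitespace collapsed before comparison. The names map_sentence, reduce_sentence, run_mapreduce, and unique_sentences are made up for this illustration.

```python
from collections import defaultdict


def map_sentence(line):
    """Map: emit (normalized sentence, 1) for one input line."""
    sentence = " ".join(line.split())  # collapse runs of whitespace
    if sentence:                       # skip blank lines
        yield sentence, 1


def reduce_sentence(sentence, counts):
    """Reduce: emit each distinct sentence once; the counts are ignored."""
    yield sentence


def run_mapreduce(lines, mapper, reducer):
    """In-memory stand-in for a MapReduce runtime (assumption, not a real system)."""
    groups = defaultdict(list)
    for line in lines:                      # map phase
        for key, value in mapper(line):
            groups[key].append(value)       # shuffle: group values by key
    results = []
    for key in sorted(groups):              # reduce phase, keys in sorted order
        results.extend(reducer(key, groups[key]))
    return results


def unique_sentences(lines):
    return run_mapreduce(lines, map_sentence, reduce_sentence)
```

The sentence itself is the key, so the shuffle does the deduplication and the reducer only has to emit the key. On a real system the same two functions would be adapted to that system's mapper and reducer interfaces.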

(Much harder) Find the 1-distance sentences. Can you do this part using Apache Pig?
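One possible approach to the 1-distance part is sketched below in plain Python. It assumes that "1-distance" means two sentences differ by exactly one word insertion, deletion, or substitution; the handout does not define the term, so check this against the Sentence Problem statement. The mapper emits each sentence under every key obtained by deleting one word (plus the sentence itself), so any two sentences within one word edit share at least one key. The reducer then verifies each candidate pair directly. All function names are hypothetical, and the grouping is again simulated in memory.

```python
from collections import defaultdict
from itertools import combinations


def within_one(a, b):
    """True if word tuples a and b differ by exactly one word edit."""
    if a == b:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal) one
    if len(b) - len(a) > 1:
        return False
    i = 0
    while i < len(a) and a[i] == b[i]:   # skip the common prefix
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # one substitution at position i
    return a[i:] == b[i + 1:]            # one extra word in b at position i


def map_candidates(sentence):
    """Map: emit (key, words) for the sentence and each one-word deletion."""
    words = tuple(sentence.split())
    yield words, words                   # lets a shorter sentence meet longer ones
    for i in range(len(words)):
        yield words[:i] + words[i + 1:], words


def reduce_candidates(key, sentences):
    """Reduce: verify every candidate pair that shares this key."""
    for a, b in combinations(sorted(set(sentences)), 2):
        if within_one(a, b):
            yield " ".join(a), " ".join(b)


def one_distance_pairs(sentences):
    groups = defaultdict(list)
    for s in set(sentences):             # assumes duplicates were removed first
        for key, value in map_candidates(s):
            groups[key].append(value)
    pairs = set()                        # a pair may be found under several keys
    for key, values in groups.items():
        pairs.update(reduce_candidates(key, values))
    return sorted(pairs)
```

The verification step matters because sharing a key is necessary but not sufficient: "a b" and "b c" both reduce to "b" after one deletion, yet they are two edits apart. The number of emitted keys grows with sentence length, which is part of what makes this step harder than deduplication.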

