Even more details

2018-04-11 11:57:19 -05:00
parent a3186d9383
commit 6a39d8acf3
1 changed files with 23 additions and 1 deletions
--- a/concurrency-mapreduce/README.md
+++ b/concurrency-mapreduce/README.md
@@ -186,10 +186,31 @@ invoked once per key, and is passed the key along with a function that enables
 iteration over all of the values that produced that same key. To iterate, the
 code just calls `get_next()` repeatedly until a NULL value is returned;
 `get_next` returns a pointer to the value passed in by the `MR_Emit()`
-function above. 
+function above. The output, in the example, is just a count of how many times
 a given word has appeared.
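 A minimal sketch of the iteration pattern described above, not the library's actual interface. The `Getter` type, the `count_values()` helper, and the array-backed `fake_get_next()` are hypothetical stand-ins invented for this example; the real `get_next` signature may differ.

 ```c
 #include <stdio.h>
 #include <stddef.h>

 /* Hypothetical iterator type: returns the next value, or NULL when done. */
 typedef char *(*Getter)(void);

 /* Stand-in data source (an assumption): three values emitted for one key. */
 char *fake_values[] = {"1", "1", "1", NULL};
 size_t fake_pos = 0;

 char *fake_get_next(void) {
     char *v = fake_values[fake_pos];
     if (v != NULL)
         fake_pos++;   /* advance only while values remain */
     return v;
 }

 /* Call get_next() until it returns NULL, counting the values seen. */
 int count_values(Getter get_next) {
     int count = 0;
     while (get_next() != NULL)
         count++;
     return count;
 }

 int main(void) {
     /* "hello" is a made-up key used only for illustration. */
     printf("%s %d\n", "hello", count_values(fake_get_next));
     return 0;
 }
 ```

 Counting the values is enough for a word-count reducer because each emitted value stands for one occurrence of the key.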
 All of this computation is started off by a call to `MR_Run()` in the `main()`
 routine of the user program. This function is passed the `argv` array, and
 assumes that `argv[1]` ... `argv[n-1]` (with `argc` equal to `n`) all contain
 file names that will be passed to the mappers.
 One interesting function that you also need to pass to `MR_Run()` is the
 partitioning function. In most cases, programs will use the default function
 (`MR_DefaultHashPartition`), which should be implemented by your code. Here is
 its implementation:
 ```
 unsigned long MR_DefaultHashPartition(char *key, int num_buckets) {
    unsigned long hash = 5381;
    int c;
    while ((c = *key++) != '\0')
      hash = hash * 33 + c;
    return hash % num_buckets;
 }
 ```
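 To see how the function spreads keys across buckets, here is a small self-contained sketch. The sample keys and the bucket count of 4 are assumptions chosen for illustration; the hash function itself is copied unchanged from the listing above.

 ```c
 #include <stdio.h>

 /* Same function as in the README, repeated so the sketch compiles on its own. */
 unsigned long MR_DefaultHashPartition(char *key, int num_buckets) {
     unsigned long hash = 5381;
     int c;
     while ((c = *key++) != '\0')
         hash = hash * 33 + c;
     return hash % num_buckets;
 }

 int main(void) {
     /* Hypothetical sample keys; "apple" appears twice on purpose. */
     char *keys[] = {"apple", "banana", "apple", "cherry"};
     int num_buckets = 4;
     for (int i = 0; i < 4; i++) {
         printf("%s -> bucket %lu\n", keys[i],
                MR_DefaultHashPartition(keys[i], num_buckets));
     }
     return 0;
 }
 ```

 Both occurrences of `apple` print the same bucket number, since the result depends only on the bytes of the key and on `num_buckets`.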
 The function's role is to map a given `key` to an integer in the range `0`
 to `num_buckets - 1`. Its use is internal to the MapReduce library; 
@@ -197,6 +218,7 @@ function above.
 ## Considerations
 - **Thread Management**. 
 - **Memory Management**. yyy.