First phase record submission: Contents
Second phase record submission: Contents
Lab Internal: Paper
30/09/2021 (61-92) and 01/10/2021 (1-34)
Word count program code: Code
07/10/2021 (61-92)
Experiment-1:
Frequent itemset: an itemset X is called frequent for database 𝐷 if and only if the fraction of transactions in 𝐷 that contain X is at least the minimum support: support(X) >= min support
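As a small worked example of the definition above, the sketch below computes support(X) as a fraction of transactions and compares it with min support. The toy database `D` and the helper name `support` are made up for illustration and are not part of the lab material.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

# Toy database D of four transactions (hypothetical data).
D = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs", "milk"],
    ["eggs"],
]

min_support = 0.5
print(support(["milk"], D))                          # 3 of 4 transactions
print(support(["milk", "bread"], D) >= min_support)  # 2 of 4 meets a 50% threshold
```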
1. Download the data set from here: groceries.csv
2. Write a Map-Reduce program to find the frequent 1-itemsets with 25% min support
3. Write a Map-Reduce program to find the frequent 2-itemsets with 23% min support
4. Write a Map-Reduce program to find the frequent 3-itemsets with 20% min support
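One possible shape for the steps above, sketched in plain Python rather than on a real Hadoop cluster: the mapper emits every k-item combination of a transaction, the reducer sums the counts, and itemsets below the threshold are dropped. It assumes each input line is one transaction with comma-separated items, which may not match the actual layout of groceries.csv. The names `mapper`, `reducer` and `frequent_itemsets` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def mapper(line, k):
    """Emit (itemset, 1) for every k-item combination in one transaction."""
    # Assumption: one transaction per line, items separated by commas.
    items = sorted({item.strip() for item in line.split(",") if item.strip()})
    for combo in combinations(items, k):
        yield combo, 1

def reducer(itemset, counts):
    """Sum the partial counts for one itemset."""
    return itemset, sum(counts)

def frequent_itemsets(lines, k, min_support):
    """Run map, shuffle and reduce in memory; keep itemsets meeting min_support."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line, k):   # map phase
            groups[key].append(value)        # shuffle: group values by key
    threshold = min_support * len(lines)     # min_support is a fraction, e.g. 0.25
    result = {}
    for key, values in groups.items():       # reduce phase
        itemset, count = reducer(key, values)
        if count >= threshold:
            result[itemset] = count
    return result

# Hypothetical toy input standing in for the real data set.
transactions = [
    "milk,bread",
    "milk,eggs",
    "bread,eggs,milk",
    "eggs",
]
print(frequent_itemsets(transactions, 1, 0.25))
print(frequent_itemsets(transactions, 2, 0.5))
```

Sorting the items before taking combinations keeps each itemset in one canonical order, so (bread, milk) and (milk, bread) are counted under the same key.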
Experiment-2:
1. Download the MovieLens data set
2. Using the ratings.csv file, write a Map-Reduce program to find the average rating of each movie, where the average rating of movie X is defined as (sum of ratings of movie X) / (total number of ratings of X) (not a great formula, come up with your own)
3. Using the movies.csv file, write a Map-Reduce program to find the total number of movies in each genre.
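The two jobs above could be sketched as follows, again simulating map, shuffle and reduce in plain Python. The sketch assumes ratings.csv has `movieId` and `rating` columns and movies.csv has a `genres` column with genres separated by `|`; check the header of the files you download. The sample rows and all function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

def map_rating(row):
    """ratings.csv row -> (movieId, (rating, 1))."""
    yield row["movieId"], (float(row["rating"]), 1)

def reduce_rating(movie_id, pairs):
    """Average = sum of ratings / number of ratings."""
    total = sum(rating for rating, _ in pairs)
    n = sum(count for _, count in pairs)
    return movie_id, total / n

def map_genre(row):
    """movies.csv row -> (genre, 1) for each genre of the movie."""
    for genre in row["genres"].split("|"):
        yield genre, 1

def reduce_genre(genre, ones):
    return genre, sum(ones)

def run_job(rows, mapper, reducer):
    """Map every row, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# Hypothetical sample rows in the assumed column layout.
ratings_csv = """userId,movieId,rating,timestamp
1,10,4.0,0
2,10,5.0,0
1,20,3.0,0
"""
movies_csv = """movieId,title,genres
10,"Movie A (2000)",Comedy|Drama
20,"Movie B (2001)",Drama
"""
avg = run_job(csv.DictReader(io.StringIO(ratings_csv)), map_rating, reduce_rating)
genres = run_job(csv.DictReader(io.StringIO(movies_csv)), map_genre, reduce_genre)
print(avg)
print(genres)
```

Emitting (rating, 1) pairs rather than bare ratings means partial sums and counts can be merged before the final division, which is what a combiner would need on a real cluster.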
Data source for the first question:
Groceries Market Basket Dataset | Kaggle