Stage 0: Form Team

Lokananda Dhage

Mary Feng

Varun Naik

Stage 1: Problem Definition

We plan to analyze information about restaurants in the Madison, WI area. We obtained data from the Zomato API and the Yelp Dataset Challenge. Each Yelp or Zomato review is one of our text documents for Stage 2.

Stage 2: Information Extraction

We performed information extraction on 300 randomly selected Yelp reviews.
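
As an illustration of the sampling step only, here is a minimal sketch of drawing 300 reviews uniformly at random without replacement. The function name sample_reviews, the fixed seed, and the representation of reviews as a list of strings are assumptions made for this example; they are not taken from the project code.

```python
import random


def sample_reviews(reviews, k=300, seed=0):
    """Return k reviews drawn uniformly at random, without replacement.

    A fixed seed (an assumption here) makes the sample reproducible.
    If fewer than k reviews are available, all of them are returned.
    """
    rng = random.Random(seed)
    return rng.sample(list(reviews), min(k, len(reviews)))


if __name__ == "__main__":
    # Placeholder documents standing in for real Yelp review text.
    all_reviews = [f"review {i}" for i in range(1000)]
    subset = sample_reviews(all_reviews)
    print(len(subset))
```

Fixing the seed means the same 300 documents can be recovered later, which helps if the extraction has to be rerun or checked.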

Stage 3: Entity Matching

Because our Yelp/Zomato dataset had fewer than 3,000 tuples in each table, we switched to a different dataset for this stage of the project. We performed entity matching between a Song table with 961,593 tuples and a Track table with 734,485 tuples.
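
With tables of this size, comparing every Song tuple to every Track tuple is impractical, so entity matching usually begins with a blocking step. The sketch below shows one common pattern: block on shared title tokens, then keep pairs whose Jaccard similarity clears a threshold. It is not necessarily the method we used. The (id, title) tuple layout, the function names, and the 0.5 threshold are all assumptions for illustration.

```python
from collections import defaultdict


def tokens(text):
    """Lowercase a title and split it into a set of whitespace tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def block_and_match(songs, tracks, threshold=0.5):
    """Return (song_id, track_id) pairs judged to be matches.

    songs and tracks are lists of (id, title) tuples (hypothetical schema).
    Blocking: only pairs that share at least one title token are compared.
    """
    index = defaultdict(list)
    for track_id, title in tracks:
        for tok in tokens(title):
            index[tok].append((track_id, title))

    matches = []
    for song_id, title in songs:
        seen = set()  # avoid scoring the same candidate twice
        for tok in tokens(title):
            for track_id, track_title in index.get(tok, ()):
                if track_id in seen:
                    continue
                seen.add(track_id)
                if jaccard(title, track_title) >= threshold:
                    matches.append((song_id, track_id))
    return matches
```

A real pipeline would typically block on rarer tokens or on several attributes, since common words produce very large candidate sets.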

Stage 4: Data Merging

We returned to our Yelp/Zomato dataset. We combined the two into a single dataset in CSV format.
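
The following is a minimal sketch of one way to write two sources into a single CSV, assuming each source is a list of dictionaries. The field names in the usage example, the added source column, and the choice to take the union of columns are assumptions for illustration; the actual merge may have combined matched tuples differently.

```python
import csv
import io


def merge_to_csv(yelp_rows, zomato_rows, out):
    """Write both sources to one CSV with the union of their columns.

    Each row gets a leading 'source' column (an assumption of this sketch).
    Columns missing from a row are left empty.
    """
    fieldnames = ["source"]
    for row in list(yelp_rows) + list(zomato_rows):
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for source, rows in (("yelp", yelp_rows), ("zomato", zomato_rows)):
        for row in rows:
            writer.writerow({**row, "source": source})


if __name__ == "__main__":
    buffer = io.StringIO()
    merge_to_csv(
        [{"name": "A", "stars": 4}],        # hypothetical Yelp record
        [{"name": "B", "rating": 3.5}],     # hypothetical Zomato record
        buffer,
    )
    print(buffer.getvalue())
```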

Stage 5: Data Analysis

We performed correlation discovery on our merged Yelp/Zomato file.
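
For reference, a minimal sketch of computing the Pearson correlation coefficient between two numeric columns. The function name and the plain-list input are assumptions for this example, and Pearson correlation is only one of several measures that correlation discovery might use.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")

    return cov / math.sqrt(var_x * var_y)
```

Applied to the merged file, xs and ys would be two numeric columns read from the CSV, with rows that lack either value dropped first.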