Hadoop/Big Data Projects
Fall 2014


The project should use 3 to 5 of the following technologies:

Project Examples



Throughout the semester, certain milestones will be set to ensure success prior to the project defense.

No. Due Date Description
1 Oct 6, 2014 An approximately one-page description of the project you will be delivering at the end of the semester, including data sources, what will be delivered, and the technologies that will be used.
2 Nov 7, 2014 An alpha release of your project. Please email me a one- to two-page status report containing the following items:

  • Project Overview
    • Refine what you delivered previously
  • Data Architecture Diagram
    • Projects used, data flow
  • List of Data Sources you are using
    • Where each one came from, features/fields you are interested in
  • Description of Deliverables
    • Final data sets, web application, etc.
  • What you have completed so far
    • Data is ingested, MapReduce is running, web app is partially functional, algorithms are there but need fine-tuning, etc.
  • Risks
    • What things may prevent you from completing the project by the due date.

In addition to this status report, please email any source code or other deliverables you have so far. I do not want your data! It is probably pretty big.

3 Dec 8, 2014 An approximately five-page report containing the items below. It is the continued evolution of your earlier project reports. Also, write it as though I don't know anything about Hadoop.

  • Project Abstract
    • Briefly describe the problem you are addressing, the technologies you are using, and your solution.
  • Problem!
    • Introduce the problem you are addressing and the technologies you are using (with a brief description of what each technology is). Include a data architecture diagram showing the data sources and the solution.
  • Data Sets and Transformations
    • Where the original data came from and the features/fields you are using. Describe how the data is transformed and used in your data architecture, and the final results (or web page or whatever). Be specific.
  • Implementation
    • How was your project implemented? What problems or issues did you encounter and how did you overcome them? Include any changes you have made since you started.
  • Results
    • Describe your final results and any interesting things you found. Include data visualization pieces where necessary: descriptions of web visualizations, how a user may interact with your system, and screenshots of visualizations. Go nuts.
  • Future Work
    • Describe what could be done in the future that would make you all happier about the project. Maybe a different technology should have been chosen. Maybe you want to expand on what you have and include other technologies to do something sweet.

In addition, prepare a 20-minute presentation for your project defense covering more or less the contents of your paper. Include any demos as you see fit, and leave time for questions.

In addition to the report and presentation, please email any source code or other deliverables you have so far. I still do not want your data! It is still probably pretty big.