CS589 Machine Learning - Fall 2019
Homework 1: Classification
Due: September 30, 11:55 pm
Getting Started: You should complete the assignment using your own installation of Python 3.6. Download the assignment archive from Moodle and unzip the file. This will create the directory structure as shown
below. You will write your code under the Submission/Code directory. Make sure to put the deliverables
(explained below) into the respective directories.
|-- Credit Card Transaction
If you are stuck on a question, consider attending the office hours of the TA listed for that question.
Data Sets: It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. In this assignment, you will
experiment with different classifiers on the binary classification problem of anomaly detection in credit card
transactions. The dataset described below contains transactions that occurred in a two-day period. Due to
confidentiality issues, the background information about the features will not be described. You only know
that the first attribute describes the dollar amount in the transaction and the class output is either 0 for
normal or 1 for fraud.
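As a rough illustration of this setup, the sketch below builds hypothetical stand-in data with the same layout (29 attributes per transaction, the first being the dollar amount, and a label of 0 for normal or 1 for fraud) and scores a majority-class baseline. The generator, the helper names, and the 1% fraud rate are assumptions made for illustration only; they are not part of the assignment, and the real data must be loaded from the files in the assignment archive.

```python
import random

N_FEATURES = 29        # dimensionality of the Credit Card Transaction dataset
NORMAL, FRAUD = 0, 1   # class labels as described in the assignment


def make_synthetic_transactions(n_rows, fraud_rate=0.01, seed=0):
    """Hypothetical stand-in data, NOT the real dataset.

    fraud_rate is an assumed value; the assignment does not state
    how often fraud occurs.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        label = FRAUD if rng.random() < fraud_rate else NORMAL
        # First attribute: dollar amount of the transaction.
        amount = round(rng.uniform(1.0, 500.0), 2)
        # Remaining 28 attributes are anonymous; here they are random noise
        # shifted slightly by the label.
        others = [rng.gauss(float(label), 1.0) for _ in range(N_FEATURES - 1)]
        X.append([amount] + others)
        y.append(label)
    return X, y


def majority_baseline(y_train):
    """Return the most common label in the training set."""
    return max(set(y_train), key=y_train.count)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


X_train, y_train = make_synthetic_transactions(2000, seed=0)
X_test, y_test = make_synthetic_transactions(500, seed=1)

# Predict the majority training label for every test transaction.
baseline_label = majority_baseline(y_train)
y_pred = [baseline_label] * len(X_test)
baseline_acc = accuracy(y_test, y_pred)
print("baseline label:", baseline_label, "accuracy:", baseline_acc)
```

A baseline like this gives a reference point for the classifiers you train: if fraud is rare in the data, predicting "normal" everywhere can already score a high accuracy, so a useful classifier should be compared against that number.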
Dataset | Training Cases | Test Cases | Dimensionality | Number of Classes
Credit Card Transaction | 200000 | 50000 | 29 | 2
Deliverables: This assignment has three types of deliverables: a report, code files, and Kaggle submissions.
• Report: The solution report will give your answers to the homework questions (listed below). The
maximum length of the report is 5 pages in 11 point font, including all figures and tables. You can use
any software to create your report, but your report must be submitted in PDF format.
• Code: The second deliverable is the code that you wrote to answer the questions, which will involve
training classifiers and making predictions on held-out test data. Your code must be Python 3.6 (no
iPython notebooks, other formats or code from other versions). You may create any additional source
files to perform data analysis. However, you should aim to write your code so that it is possible
to reproduce all of your experimental results exactly by running python run_me.py from the
Submissions/Code directory. Remember to comment your code. Points will be deducted from your
assignment grade if your code is difficult to reproduce!
• Kaggle Submissions: We will use Kaggle, a machine learning competition service, to evaluate the
classifiers you create. You will need to register on Kaggle using a umass.edu email address to submit to
Kaggle, but you can choose any user name you like. You will generate test prediction files, save them