NEW COURSE:
Introduction to Data Mining and Knowledge Discovery
CIS 350 – Fall 2001
Meeting days:
Monday, 9:40 am - 11:30 am, Tuttleman Learning Center 1A
Wednesday, 10:40 am - 11:30 am, Tuttleman Learning Center 1A
Friday, 9:40 am - 11:30 am, Wachman Hall 200 (lab)
Instructors:
First third of the course: Zoran Obradovic, 303 Wachman Hall, phone: 204-5082
Rest of the course: Slobodan Vucetic, 323 Wachman Hall, phone: 204-5773. Office hours: Wednesday 2:00 pm - 3:00 pm, or by appointment
Lab assistant:
Hongbo Xie, 323 Wachman Hall, hongbox@astra.ocis.temple.edu, phone: 204-5773
Objective:
Data mining has emerged as one of the most exciting
and dynamic fields in computer science. Simply stated, data mining refers to a
family of techniques used to detect ‘interesting’ relationships and
knowledge in data. CIS 350 is designed to give students a
broad background in the design and use of data mining algorithms and in applying
them to real-life situations. Case studies will draw on
practical examples of data mining systems in e-commerce, finance, medicine, and
bioinformatics. Students will be expected to develop skills for applying
data mining tools to practical problems. To this end, the course includes several
tutorials on using Matlab and its toolboxes. The course is
aimed at a broad group of students from computer science, engineering,
science, business, and other disciplines.
Prerequisites:
Elementary computer programming skills (in any
programming language) and some knowledge of statistics and linear
algebra. In all other respects the course will be self-contained.
Required texts:
S. Weiss, N. Indurkhya, Predictive Data Mining, 1998.
R. Pratap, Getting Started with MATLAB 5: A Quick Introduction for Scientists and Engineers, 1998.
Course Topics:
· Overview: (1) The appeal; (2) Application areas; (3) Methodological issues.
· Data Mining Methods: (1) Traditional statistical methods; (2) Neural networks; (3) Decision trees; (4) Clustering; (5) Association rules.
· Knowledge Discovery Process: (1) Exploration; (2) Model building; (3) Validation.
· Experimental Design: (1) How to select an appropriate model; (2) How to run an experiment; (3) How to evaluate performance.
· Case studies (depending on interest): e-commerce, web mining, finance, medicine, bioinformatics…
Grading:
One Midterm (15%),
Lab Quizzes (15%),
Final Exam (20%),
Homework (25%),
Two Projects (25%)
Final grade (could be adjusted):
| Letter Grade | Total Points |
| A | > 90 |
| B | 81 – 90 |
| C | 71 – 80 |
| D | 61 – 70 |
| F | < 61 |
Academic Honesty:
Academic honesty is taken seriously. You must write up your own solutions and code. For homework problems and projects, you may
discuss the problems or assignments verbally with other class members, the TA, or
the instructors. You must acknowledge the people with whom you discussed your work. Any
other sources (e.g., the Web, research papers, books) used for solutions or code must also be acknowledged.