NEW COURSE:

 

Introduction to Data Mining and Knowledge Discovery

CIS 350 – Fall 2001

 

meeting days:

Monday at 9:40 am - 11:30 am in Tuttleman Learning Center 1A
Wednesday at 10:40 am - 11:30 am in Tuttleman Learning Center 1A
Friday at 9:40 am - 11:30 am in Wachman Hall 200 (LAB)

 

instructors:

First 1/3 of the course

Zoran Obradovic, 303 Wachman Hall, phone: 204-5082

The rest of the course

Slobodan Vucetic, 323 Wachman Hall, phone: 204-5773

Office Hours: Wednesday 2:00 pm - 3:00 pm, or by appointment

 

lab assistant:

Hongbo Xie, 323 Wachman Hall, hongbox@astra.ocis.temple.edu, phone: 204-5773

Lab Page

 

Objective:

Data mining has emerged as one of the most exciting and dynamic fields in computer science. Simply stated, data mining refers to a family of techniques used to detect ‘interesting’ nuggets of relationships/knowledge in data. CIS 350 is designed to provide students with a broad background in the design and use of data mining algorithms and in applying them to real-life situations. Case studies will be provided using practical examples of data mining systems in e-commerce, finance, medicine, and bioinformatics. Students will be expected to develop skills for applying data mining tools to practical problems. To this end, the course will include several tutorials on using MATLAB and its toolboxes. The course is targeted to a broad group of students from computer science, engineering, science, business, and other disciplines.

 

Prerequisites:

Elementary computer programming skills (in any programming language) and some knowledge of statistics and linear algebra. In all other respects the course will be self-contained.

 

Required texts:

S. Weiss, N. Indurkhya, Predictive Data Mining, 1998.

R. Pratap, Getting Started with MATLAB 5, A Quick Introduction for Scientists and Engineers, 1998.
 

Course Topics:

·         Overview: (1) The appeal; (2) Application areas; (3) Methodological issues.

·         Data Mining Methods: (1) Traditional statistical methods; (2) Neural networks; (3) Decision trees; (4) Clustering; (5) Association rules.

·         Knowledge Discovery Process: (1) Exploration; (2) Model building; (3) Validation.

·         Experimental Design: (1) How to select an appropriate model; (2) How to run an experiment; (3) How to evaluate performance.

·         Case studies (depending on interest): e-commerce, web mining, finance, medicine, bioinformatics…

 

Grading:

final grade (could be adjusted):

Letter Grade    Total Points
A               > 90
B               81 – 90
C               71 – 80
D               61 – 70
F               < 61

 

Academic Honesty:

Academic honesty is taken seriously. You must write up your own solutions and code. For homework problems and projects, you may discuss the problems or assignments verbally with other class members, the TA, or the instructor, but you must acknowledge the people with whom you discussed your work. Any other sources (e.g., Web, research papers, books) used for solutions and code must also be acknowledged.