|
Week - 1 |
Introduction to data mining; overview of data acquisition and preprocessing; essential components and application areas of data mining. |
|
Week - 2 |
Data objects; attribute types; basic descriptive statistics. |
|
Week - 3 |
Measures of data distribution and dispersion; visualization techniques; graphical analysis. |
|
Week - 4 |
Similarity and dissimilarity measures; distance metrics; computing data similarity. |
|
Week - 5 |
Data cleaning; missing data handling; noise reduction; resolving inconsistencies. |
|
Week - 6 |
Data integration; data reduction; sampling; normalization and discretization. |
|
Week - 7 |
Frequent pattern concepts; support, confidence, and association rules. |
|
Week - 8 |
Basics of classification; training/test sets; types of classification errors. |
|
Week - 9 |
Decision trees; information gain; Gini index; model evaluation. |
|
Week - 10 |
Naive Bayes; Bayesian concepts; rule-based classification. |
|
Week - 11 |
Clustering concepts; clustering types; similarity-based grouping. |
|
Week - 12 |
K-means; hierarchical clustering; density-based methods. |
|
Week - 13 |
Outlier concept; distance-based, density-based, and model-based outlier detection. |
|
Week - 14 |
Design of an end-to-end data processing pipeline that integrates the concepts covered during the semester: preprocessing, frequent pattern mining, classification, clustering, and outlier detection. Capstone application on a real dataset, covering model selection, interpretation of results, and comparative evaluation. |