Zachodniopomorski Uniwersytet Technologiczny w Szczecinie

Central University Administration - International Exchange (S1)

Course syllabus: Data Mining Algorithms

Basic information

Field of study: International exchange
Mode of study: full-time
Level: first cycle
Graduate's professional title:
Areas of study:
Profile:
Module:
Course: Data Mining Algorithms
Specialization: common course
Teaching unit: Katedra Metod Sztucznej Inteligencji i Matematyki Stosowanej
Responsible teacher: Przemysław Klęsk <pklesk@wi.zut.edu.pl>
Other teachers:
ECTS (planned): 4.0
ECTS (by forms of teaching): 4.0
Form of assessment: credit
Language: English
Elective block:
Elective group:

Forms of teaching

Form | Code | Semester | Hours | ECTS | Weight | Assessment
lectures | W | 1 | 15 | 1.5 | 0.30 | credit
laboratories | L | 1 | 30 | 2.5 | 0.70 | credit

Prerequisites

Code | Prerequisite
W-1 | Mathematics
W-2 | Programming
W-3 | Algorithms and data structures

Course objectives

Code | Objective
C-1 | Building an understanding of learning from data.
C-2 | Familiarization with probabilistic, tree-based, and boosted classifiers, and the related algorithms.
C-3 | Familiarization with rule mining and the related algorithms.

Course content by form of teaching

Code | Content | Hours
laboratories
T-L-1 | Programming the naive Bayes classifier (MATLAB) for the 'wine data set' (in class) and for a selected data set (homework). | 8
T-L-2 | Programming the Apriori algorithm: mining association rules. | 6
T-L-3 | Programming an exhaustive generator of decision rules (for a given premise length). | 6
T-L-4 | Programming the CART algorithm: building a complete tree. | 4
T-L-5 | Programming heuristics for pruning CART trees. | 6
Total: 30
lectures
T-W-1 | Review of some elements of probability calculus. Derivation of the naive Bayes classifier. Remarks on computational complexity with and without the naive assumption. Bayes' rule. Laplace correction. Beta distributions. | 4
T-W-2 | Mining association rules by means of the Apriori algorithm. Support and confidence measures. Finding frequent sets (induction). Rule generation mechanics. Remarks on the hash map data structure used in the Apriori algorithm. Pareto-optimal rules. Remarks on decision rule generation. | 4
T-W-3 | Decision trees and the CART algorithm. Impurity functions and their properties. Best splits as minimizers of the expected impurity of child nodes. CART greedy algorithm. Tree pruning heuristics (by depth, by penalizing the number of leaves). Recursions for traversing the subtrees (greedy and exhaustive). | 3
T-W-4 | Ensemble methods: bagging and boosting (meta classifiers). AdaBoost algorithm. Exponential criterion vs. zero-one loss function. Real boost algorithm. | 2
T-W-5 | Exam. | 2
Total: 15
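
The support and confidence measures listed under T-W-2 can be illustrated with a minimal Python sketch. It is not taken from the course materials: the function names and the toy transactions below are hypothetical, and the definitions assumed are the standard ones (support as the fraction of transactions containing an item set, confidence as the support of premise and conclusion together divided by the support of the premise).

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)  # subset test
    return hits / len(transactions)


def confidence(premise: FrozenSet[str], conclusion: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """support(premise and conclusion together) / support(premise)."""
    premise_support = support(premise, transactions)
    if premise_support == 0.0:
        return 0.0  # rule is undefined on this data; 0.0 is a convention chosen here
    return support(premise | conclusion, transactions) / premise_support


# Hypothetical toy data, made up for illustration only.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support(frozenset({"bread", "milk"}), transactions))  # 0.5
print(confidence(frozenset({"bread"}), frozenset({"milk"}), transactions))
```

In this toy data the rule "bread -> milk" holds in two of the three transactions that contain bread, so its confidence is about 0.67. The Apriori algorithm itself (level-wise generation of frequent sets) is not shown here.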

Student workload by form of activity

Code | Activity | Hours
laboratories
A-L-1 | Participation in lab classes. | 30
A-L-2 | Programming homework tasks. | 30
A-L-3 | Preparation for short tests (15 min) carried out in lab classes. | 15
Total: 75
lectures
A-W-1 | Participation in lectures. | 13
A-W-2 | Sitting for the exam. | 2
A-W-3 | Preparation for the exam. | 30
Total: 45

Teaching methods / teaching tools

Code | Teaching method / teaching tool
M-1 | Lectures.
M-2 | Computer programming.

Assessment methods

Code | Assessment method
S-1 | Formative assessment: Four short tests (15 minutes each), held during the lab at the end of each topic.
S-2 | Formative assessment: Four grades for the programs written as homework.
S-3 | Summative assessment: Final grade for the lab, calculated as a weighted mean of the partial grades: tests (weight: 40%) and programs (weight: 60%).
S-4 | Summative assessment: Final grade for the lectures, based on the test (2 h).
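
The weighted mean described in S-3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the syllabus gives the two weights (40% and 60%) but does not say how grades within each group are combined or how the result is rounded, so the sketch averages each group first and does no rounding. The function name and the sample grades are hypothetical.

```python
from statistics import mean
from typing import Sequence

TEST_WEIGHT = 0.4     # weight of the short tests, as stated in S-3
PROGRAM_WEIGHT = 0.6  # weight of the homework programs, as stated in S-3


def lab_grade(test_grades: Sequence[float], program_grades: Sequence[float]) -> float:
    """Weighted mean of the average test grade and the average program grade.

    Assumption: grades within each group are averaged with equal weight
    before the 40/60 weighting is applied; no rounding is performed.
    """
    return TEST_WEIGHT * mean(test_grades) + PROGRAM_WEIGHT * mean(program_grades)


# Hypothetical partial grades on the 2.0-5.0 scale, for illustration only.
tests = [3.0, 4.0, 4.0, 5.0]      # average 4.0
programs = [4.0, 4.0, 5.0, 5.0]   # average 4.5
print(lab_grade(tests, programs))
```

With these sample grades the result is approximately 4.3, that is 0.4 * 4.0 + 0.6 * 4.5.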

Intended learning outcomes - knowledge

Learning outcome: WM-WI_1-_??_W01
Description: Student has elementary knowledge of data mining algorithms and notions.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objectives: C-1, C-2, C-3
Course content: T-W-1, T-W-2, T-W-3, T-W-4, T-W-5, T-L-1, T-L-2, T-L-3, T-L-4, T-L-5
Teaching methods: M-1
Assessment method: S-4

Intended learning outcomes - skills

Learning outcome: WM-WI_1-_??_U01
Description: Student can implement (in MATLAB or Python) the data mining algorithms presented during lectures.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objectives: C-1, C-2, C-3
Course content: T-W-1, T-W-2, T-W-3, T-W-4, T-W-5, T-L-1, T-L-2, T-L-3, T-L-4, T-L-5
Teaching methods: M-2
Assessment method: S-2

Assessment criteria - knowledge

Learning outcome: WM-WI_1-_??_W01
Description: Student has elementary knowledge of data mining algorithms and notions.

Grade | Criterion
2.0 |
3.0 | Obtaining at least 50% of the points in the final test.
3.5 |
4.0 |
4.5 |
5.0 |

Assessment criteria - skills

Learning outcome: WM-WI_1-_??_U01
Description: Student can implement (in MATLAB or Python) the data mining algorithms presented during lectures.

Grade | Criterion
2.0 |
3.0 | Obtaining a positive average grade for the homework programming projects.
3.5 |
4.0 |
4.5 |
5.0 |

Basic literature

  1. M. J. Zaki, W. Meira Jr., "Data Mining and Analysis - Fundamental Concepts and Algorithms", Cambridge University Press, 2014.
  2. P. Klęsk, Electronic materials for the course, available at: http://wikizmsi.zut.edu.pl, 2015.

Note on student workload: 1 ECTS point corresponds to approximately 30 hours of student activity.
