Zachodniopomorski Uniwersytet Technologiczny w Szczecinie

Central University Administration - International Exchange (S1)

Course syllabus: Machine Learning

Basic information

Field of study: International Exchange
Form of studies: full-time
Level: first cycle
Professional title of graduate:
Areas of study:
Profile:
Module:
Course: Machine Learning
Specialization: common course
Offering unit: Katedra Metod Sztucznej Inteligencji i Matematyki Stosowanej
Responsible teacher: Przemysław Klęsk <pklesk@wi.zut.edu.pl>
Other teachers:
ECTS (planned): 5.0
ECTS (forms): 5.0
Form of assessment: credit
Language: English
Elective block:
Elective group:

Teaching forms

Teaching form  Code  Semester  Hours  ECTS  Weight  Assessment
lectures       W     1         30     2.0   0.30    credit
laboratories   L     1         30     3.0   0.70    credit

Prerequisites

Code  Prerequisite
W-1   mathematics
W-2   algorithms and data structures
W-3   programming
W-4   probability calculus and statistics

Course objectives

Code  Objective of the module/course
C-1   Developing a general understanding of data analysis and machine learning methods.
C-2   Building an understanding of learning from data.
C-3   Familiarization with probabilistic, tree-based, and boosted classifiers, and the related algorithms.
C-4   Familiarization with rule mining and related algorithms.

Course content by class type

Laboratories

Code   Course content (hours)
T-L-1  Programming PCA in MATLAB. (2 h)
T-L-2  Programming CART trees in MATLAB. (2 h)
T-L-3  Programming SVM optimization tasks (several versions) in MATLAB. (2 h)
T-L-4  Programming the MARS algorithm in MATLAB. (2 h)
T-L-5  Programming the naive Bayes classifier (MATLAB) - for the 'wine data set' (in class) and a selected data set (homework). (6 h)
T-L-6  Programming the Apriori algorithm - mining association rules. (4 h)
T-L-7  Programming an exhaustive generator of decision rules (for a given premise length). (4 h)
T-L-8  Programming the CART algorithm - building a complete tree. (4 h)
T-L-9  Programming heuristics for pruning CART trees. (4 h)
Total: 30 h
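
A minimal sketch of the idea behind T-L-6, assuming transactions are given as Python sets and support is measured as the fraction of transactions containing an itemset. The function names apriori and confidence, and the toy shopping data, are illustrative assumptions and are not taken from the course materials. Frequent sets are built level by level (induction on set size) and stored in a dictionary, in the spirit of the hashmap remark in the lecture content.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s

    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-sets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
    return frequent


def confidence(frequent, premise, conclusion):
    """Confidence of premise -> conclusion: supp(premise and conclusion) / supp(premise)."""
    premise, conclusion = frozenset(premise), frozenset(conclusion)
    return frequent[premise | conclusion] / frequent[premise]


# Hypothetical toy data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_sets = apriori(baskets, min_support=0.5)
print(len(frequent_sets))
print(confidence(frequent_sets, {"butter"}, {"bread"}))
```

With a minimum support of 0.5, the pair {milk, butter} appears in only one of the four baskets and is discarded, so the three-item candidate is pruned without its support ever being counted.
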
Lectures

Code   Course content (hours)
T-W-1  Principal Component Analysis (PCA) as a method for dimensionality reduction. Review of notions: variance, covariance, correlation coefficient, covariance matrix. Minimization of projection lengths of data points onto a given direction. Derivation of PCA. Interpretation of eigenvalues and eigenvectors. (4 h)
T-W-2  Decision trees - CART algorithm. Impurity functions, greedy generation of a complete tree. Pruning heuristics for decision trees (depth-based, leaves-based). (4 h)
T-W-3  Support Vector Machines (SVM). Distance of data points from the decision hyperplane. Separation margin. Formulation of the SVM optimization task without and with Lagrange multipliers. Support vectors - what are they? Soft-margin SVM and related optimization tasks. SVMs with a non-linear decision boundary using the kernel trick. (5 h)
T-W-4  Multivariate Adaptive Regression Splines (MARS) for approximation tasks. Construction of splines. Least-squares approximation with arbitrary bases (in particular MARS splines). Learning algorithm. Similarities to CART. (2 h)
T-W-5  Review of some elements of probability calculus. Derivation of the naive Bayes classifier. Remarks on computational complexity with and without the naive assumption. Bayes rule. Laplace correction. Beta distributions. (4 h)
T-W-6  Mining association rules by means of the Apriori algorithm. Support and confidence measures. Finding frequent sets (induction). Rule generation mechanics. Remarks on the hashmap data structure applied in the Apriori algorithm. Pareto-optimal rules. Remarks on decision rule generation. (4 h)
T-W-7  Decision trees and the CART algorithm. Impurity functions and their properties. Best splits as minimizers of the expected impurity of children nodes. CART greedy algorithm. Tree pruning heuristics (by depth, by penalizing the number of leaves). Recursions for traversing the subtrees (greedy and exhaustive). (3 h)
T-W-8  Ensemble methods: bagging and boosting (meta classifiers). AdaBoost algorithm. Exponential criterion vs zero-one loss function. Real boost algorithm. (2 h)
T-W-9  Exam. (2 h)
Total: 30 h
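
To make the phrase "best splits as minimizers of expected impurity of children nodes" (T-W-7) concrete, here is a minimal sketch for a single numeric feature. It assumes the Gini index as the impurity function and midpoints between consecutive distinct values as candidate thresholds. Both choices, and the names gini and best_split, are illustrative assumptions; the lectures may use other impurity functions or candidate sets.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs, ys):
    """Find the threshold that minimizes the expected impurity of the children.

    Returns (threshold, expected_impurity). The threshold is None when no
    candidate split gives a lower impurity than the unsplit node.
    """
    n = len(xs)
    values = sorted(set(xs))
    best = (None, gini(ys))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        # Children impurities weighted by the fraction of points in each child.
        expected = len(left) / n * gini(left) + len(right) / n * gini(right)
        if expected < best[1]:
            best = (threshold, expected)
    return best


# Hypothetical toy data: the classes separate perfectly at x = 2.5.
print(best_split([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))
```

A greedy CART builder would apply this search to every feature in a node, pick the best feature and threshold pair, and recurse into the two children.
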

Student workload - forms of activity

Laboratories

Code   Form of activity (hours)
A-L-1  Participation in lab classes. (30 h)
A-L-2  Programming homework assignments. (10 h)
A-L-3  Programming homework tasks. (30 h)
A-L-4  Preparation for short tests (15 min) carried out in lab classes. (5 h)
Total: 75 h

Lectures

Code   Form of activity (hours)
A-W-1  Participation in lectures. (30 h)
A-W-2  Preparation for the exam. (18 h)
A-W-3  Sitting the exam. (2 h)
Total: 50 h

Teaching methods / teaching tools

Code  Teaching method / teaching tool
M-1   Lecture.
M-2   Computer programming.

Assessment methods

Code  Assessment method
S-1   Formative assessment: Four short tests (15 minutes each), one at the end of each topic, taken during the lab.
S-2   Formative assessment: Four grades for the programs written as homework.
S-3   Summative assessment: Final grade for the lab, calculated as a weighted mean of partial grades: tests (weight: 40%), programs (weight: 60%).
S-4   Summative assessment: Final grade for the lectures, based on the test (2 h).
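
The weighted mean in S-3 can be illustrated with a short calculation. This is a sketch under stated assumptions: partial grades within each group are averaged with a plain mean, no rounding to the grade scale is applied, and the overall course grade combines lecture and lab grades with the weights 0.30 and 0.70 listed for the teaching forms. The syllabus does not spell out these details, and the function names and sample grades below are hypothetical.

```python
def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def lab_grade(test_grades, program_grades, w_tests=0.4, w_programs=0.6):
    """Weighted mean of the test average and the program average (S-3)."""
    return w_tests * mean(test_grades) + w_programs * mean(program_grades)


def course_grade(lecture_grade, lab, w_lecture=0.3, w_lab=0.7):
    """Assumed combination of lecture and lab grades using the form weights."""
    return w_lecture * lecture_grade + w_lab * lab


# Hypothetical partial grades for one student.
tests = [4.0, 3.5, 5.0, 3.0]      # four short tests (S-1)
programs = [4.5, 4.0, 5.0, 4.5]   # four homework programs (S-2)

lab = lab_grade(tests, programs)
print(round(lab, 2))
print(round(course_grade(4.0, lab), 3))
```

For these sample grades the test average is 3.875 and the program average is 4.5, so the lab grade is 0.4 * 3.875 + 0.6 * 4.5 = 4.25 before any rounding to the grade scale.
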

Intended learning outcomes - knowledge

Learning outcome: WM-WI_1-_null_W01
Student possesses elementary knowledge of machine learning algorithms and data analysis techniques.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objective: C-1
Course content: T-W-4, T-W-1, T-W-3, T-W-9, T-W-2, T-L-4, T-L-1, T-L-2, T-L-3
Teaching methods: M-1
Assessment method: S-4

Learning outcome: WM-WI_1-_null_W02
Student has elementary knowledge of data mining algorithms and notions.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objectives: C-2, C-3, C-4
Course content: T-W-9, T-L-5, T-L-6, T-L-7, T-L-8, T-L-9, T-W-5, T-W-6, T-W-7, T-W-8
Teaching methods: M-1
Assessment method: S-4

Intended learning outcomes - skills

Learning outcome: WM-WI_1-_null_U01
Student can implement (in Python or MATLAB) several machine learning algorithms and techniques.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objective: C-1
Course content: T-W-4, T-W-1, T-W-3, T-W-9, T-W-2, T-L-4, T-L-1, T-L-2, T-L-3
Teaching methods: M-2
Assessment method: S-2

Learning outcome: WM-WI_1-_null_U02
Student can implement (in MATLAB or Python) the data mining algorithms presented during lectures.
Reference to learning outcomes for the field of study:
Reference to outcomes defined for the area of education:
Course objectives: C-2, C-3, C-4
Course content: T-W-9, T-L-5, T-L-6, T-L-7, T-L-8, T-L-9, T-W-5, T-W-6, T-W-7, T-W-8
Teaching methods: M-2
Assessment method: S-2

Grading criteria - knowledge

Learning outcome: WM-WI_1-_null_W01
Student possesses elementary knowledge of machine learning algorithms and data analysis techniques.
Grade  Criterion
2.0
3.0    Obtaining at least 50% of the points in the final test.
3.5
4.0
4.5
5.0

Learning outcome: WM-WI_1-_null_W02
Student has elementary knowledge of data mining algorithms and notions.
Grade  Criterion
2.0
3.0    Obtaining at least 50% of the points in the final test.
3.5
4.0
4.5
5.0

Grading criteria - skills

Learning outcome: WM-WI_1-_null_U01
Student can implement (in Python or MATLAB) several machine learning algorithms and techniques.
Grade  Criterion
2.0
3.0    Obtaining a positive average grade for the homework programming tasks.
3.5
4.0
4.5
5.0

Learning outcome: WM-WI_1-_null_U02
Student can implement (in MATLAB or Python) the data mining algorithms presented during lectures.
Grade  Criterion
2.0
3.0    Obtaining a positive average grade for the homework programming projects.
3.5
4.0
4.5
5.0

Basic literature

  1. M. J. Zaki, W. Meira Jr, Data Mining and Analysis - Fundamental Concepts and Algorithms, Cambridge University Press, 2014
  2. P. Klęsk, Electronic materials for the course available at: http://wikizmsi.zut.edu.pl, 2015

(*) 1 ECTS point corresponds to approximately 30 hours of student activity.
