PA164 Project
Task 1 : How text representation and outlier filtering influence classifier accuracy
Evaluation of Task 1
a report in English including
- a dataset and task description, data preparation (e.g. transformation into form)
- description of the method used including
  analysis of a learning curve (x = number of attributes, y = accuracy)
  short description of pre-processing (PP), outlier detection (OD) and learning algorithms (LA)
- results
- comparison of different combinations of {PP, OD} + LA, including graphs and statistical tests
- discussion, including the outliers found
- conclusion
- link to all data and results
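
The learning curve requested above (x = number of attributes, y = accuracy) could be produced along the following lines. This is only a minimal sketch under stated assumptions: the toy corpus `DOCS`, attribute selection by document frequency, the multinomial Naive Bayes learner and leave-one-out evaluation are all illustrative choices, not ones prescribed by the assignment, and every function name is hypothetical.

```python
import math
from collections import Counter

# Toy corpus: hypothetical data, for illustration only.
DOCS = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("cheap offer click now", "spam"),
    ("win money click here", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report due monday", "ham"),
    ("lunch meeting with the project team", "ham"),
    ("please review the report", "ham"),
]


def top_k_terms(docs, k):
    """Return the k terms with the highest document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    df = Counter()
    for text, _ in docs:
        df.update(set(text.split()))
    return sorted(df, key=lambda t: (-df[t], t))[:k]


def train_nb(docs, vocab):
    """Train multinomial Naive Bayes restricted to the given attributes."""
    vocab = set(vocab)
    class_docs = Counter(label for _, label in docs)
    term_counts = {c: Counter() for c in class_docs}
    for text, label in docs:
        term_counts[label].update(t for t in text.split() if t in vocab)
    return class_docs, term_counts, vocab


def predict_nb(model, text):
    """Return the most probable class, using add-one smoothing."""
    class_docs, term_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for c in sorted(class_docs):
        total = sum(term_counts[c].values())
        score = math.log(class_docs[c] / n_docs)
        for t in text.split():
            if t in vocab:
                score += math.log((term_counts[c][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def loo_accuracy(docs, k):
    """Leave-one-out accuracy when only the top-k attributes are kept."""
    correct = 0
    for i, (text, label) in enumerate(docs):
        train = docs[:i] + docs[i + 1:]
        vocab = top_k_terms(train, k)  # select attributes on training data only
        correct += predict_nb(train_nb(train, vocab), text) == label
    return correct / len(docs)


def learning_curve(docs, ks):
    """Return (number of attributes, accuracy) pairs, one per value in ks."""
    return [(k, loo_accuracy(docs, k)) for k in ks]


if __name__ == "__main__":
    for k, acc in learning_curve(DOCS, [1, 2, 4, 8, 16]):
        print(f"{k:>3} attributes  accuracy={acc:.2f}")
```

Attributes are re-selected inside each leave-one-out fold so that the held-out document never influences which terms are kept. On a real dataset the same loop would be run once per {PP, OD} + LA combination, and the resulting pairs plotted to see where accuracy stops improving.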
Task 2 : 20 p. Deadline: February 4
Advanced pre-processing (via a NLP tool) + text categorisation
(exceptionally some other mining task)
in groups of 3-4, with the same datasets as in Task 1
a single short report per group
an excellent project with a 10-12 page report may replace the final exam.