ECTS credits: 3
ECTS hours: Tutorials: 1, Expository classes: 10, Interactive classes: 11, Total: 22
Languages of instruction: English
Type: Ordinary subject Master’s Degree RD 1393/2007 - 822/2021
Departments: External department linked to the degrees
Areas: Área externa M.U en Intelixencia Artificial
Center: Higher Technical Engineering School
Call: First Semester
Teaching: With teaching
Enrolment: Enrollable
The course introduces students to deriving information and knowledge from the analysis of collections of natural-language documents, a form that accounts for almost all generated and stored data. Students will be trained to analyze content using enriched document representation models in order to address specific applications in different domains. Special attention will be paid to the extraction of relevant information, the determination of the contextual polarity (sentiment) that can be inferred from a text, and the automatic answering of questions posed directly in natural language. In short, the course addresses questions that are fundamental to the development of interfaces, decision support environments and access to new knowledge.
Document analysis: argumentative structure, coherence and co-references. Information retrieval and extraction. Sentiment analysis. Answer search and other applications of text mining.
BASIC
Berry, M. W., & Kogan, J. (Eds.). (2010). Text mining: applications and theory. John Wiley & Sons.
COMPLEMENTARY
Jo, T. (2019). Text mining: Concepts, implementation, and big data challenge (1st ed., Studies in Big Data, Vol. 45). Springer. ISBN 978-3319918143.
General and basic skills:
CG1 - Maintain and extend well-founded theoretical approaches that enable the introduction and exploration of new and advanced technologies in the field of Artificial Intelligence.
CG3 - Look for and select useful and necessary information to solve complex problems, by making use of the bibliographic sources in the area.
CG4 - Produce, adequately and with some originality, written compositions or reasoned arguments; draft plans, work projects and scientific articles; and formulate reasonable hypotheses in the field.
CB6 - Understand knowledge that provides a basis or opportunity to be original in the development and / or application of ideas, usually in a research context.
CB7 - Students are able to apply their acquired knowledge and problem-solving skills in new or unfamiliar environments within broader (or multidisciplinary) contexts related to their area of study.
Transversal competencies:
CT7 - Develop the ability to work in interdisciplinary or transdisciplinary teams, and to present proposals that contribute to sustainable development from an environmental, economic, political and social point of view.
CT8 - Appreciate the importance of research, innovation and technological development in the socio-economic and cultural progress of society.
Specific skills:
CE1 - Understanding and mastery of lexical, syntactic and semantic processing techniques for natural languages.
CE2 - Understanding and mastery of the fundamentals and techniques for processing linked, structured and unstructured documents, and for representing their content.
The following teaching methodology is used:
- Presentation method/theoretical session: teachers present a topic to students with the aim of providing a set of information with a specific scope.
- Laboratory practices: the teachers present the students with one or more practical problems that require understanding and applying the theoretical and practical contents included in the syllabus. Students can work on solving the problems individually or as a team. These activities may require autonomous work, guided by the teacher of the subject.
- Project-based learning: students are given practical projects that take up a significant share of the total time they devote to the subject. Because of the scope of the work, students need to use management skills as well as technical skills.
- Mentoring: the teachers will meet with students in individual mentoring sessions, dedicated to guidance in their study and to resolving doubts about the contents, assignments and activities of the subject.
The Virtual Campus will be used for the distribution of materials, as well as guides and tutorials for carrying out the necessary activities.
The assessment will consist of the following parts:
E1: Final exam 25%
E2: Evaluation of laboratory work 40%
E3: Evaluation of tutored work 35%
To pass (and carry forward) E2 and E3, the student must reach at least 40% of the maximum score for each of these evaluation elements. There is no minimum required for E1.
To pass the subject, the student must reach these minimums (in E2 and E3) and obtain a final weighted grade of at least 5 points out of 10.
In the case of not obtaining the minimum required to pass any of the parts (E2 and E3), the student will have a second opportunity in which he/she will deliver the elements not passed.
For students who pass some of the evaluated elements but do not reach the minimum required to pass the subject as a whole, the grade entered in the official record will be the minimum of the weighted average of the parts passed and 4.9.
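The grading rules above can be sketched as a short calculation. This is only an illustrative sketch, not an official calculator: the function name `final_grade` and the constants are hypothetical, every part is assumed to be scored on a 0 to 10 scale, and "the weighted average of the parts passed" is interpreted here as the sum of weight times score over the parts that met their minimum, without renormalizing the weights. The guide does not spell out that last detail, so it should be treated as an assumption.

```python
# Weights of each evaluation element, as stated in the guide.
WEIGHTS = {"E1": 0.25, "E2": 0.40, "E3": 0.35}
# Minimum fraction of the maximum score required per part (E1 has none).
MINIMUM_FRACTION = {"E2": 0.40, "E3": 0.40}
MAX_SCORE = 10.0   # assumption: every part is graded out of 10
PASS_MARK = 5.0    # minimum final weighted grade to pass
FAIL_CAP = 4.9     # cap on the recorded grade when the subject is not passed


def final_grade(scores):
    """Return (recorded_grade, passed) for per-part scores on a 0-10 scale."""
    # A part "meets its minimum" if it reaches the required fraction of MAX_SCORE.
    met = {
        part: scores[part] >= MINIMUM_FRACTION.get(part, 0.0) * MAX_SCORE
        for part in WEIGHTS
    }
    weighted = sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)
    if all(met.values()) and weighted >= PASS_MARK:
        return round(weighted, 2), True
    # Assumption: only parts that met their minimum contribute, with their
    # original weights; the result is then capped at 4.9.
    partial = sum(WEIGHTS[part] * scores[part] for part in WEIGHTS if met[part])
    return round(min(partial, FAIL_CAP), 2), False


if __name__ == "__main__":
    print(final_grade({"E1": 6, "E2": 7, "E3": 8}))    # all minimums met
    print(final_grade({"E1": 10, "E2": 10, "E3": 3}))  # E3 below its 40% minimum
```

Under this reading, a student with high marks in E1 and E2 but less than 40% in E3 is recorded with at most 4.9 until E3 is resubmitted in the second opportunity.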
A student will be recorded as “Present” if he/she submits all the compulsory practicals and assignments or takes the objective test during the official evaluation period.
Continuous evaluation is assumed by default. Students who do not opt for continuous evaluation must communicate this through the mechanisms provided and within the stipulated period, once one month has elapsed from the beginning of the semester.
Homework and assignments must be completed within the established deadline and must follow the specifications given in the assignment statement for both presentation and defense.
In the case of fraudulent performance of exercises or tests, the Regulations for the evaluation of students' academic performance and review of qualifications will be applied. Under the corresponding regulations on plagiarism, the total or partial copying of any practical or theory exercise will result in a failing grade in both opportunities of the course, with a grade of 0.0 in each case.
Second opportunity
In the case of not obtaining the minimum required to pass any of the parts (E2 and E3), the student will have a second opportunity in which he/she will only deliver the elements not passed.
Second and subsequent enrollments
The evaluation criteria will be the same as for first-time students.
Class attendance
Attendance is not compulsory, but it will help to pass the course.
The temporal distribution of the course is as follows:
Distribution of the three ECTS credits:
- Theoretical sessions: 10 (on-site hours) + 21 (off-site hours)
- Practical laboratory sessions: 5 (on-site hours) + 15 (off-site hours)
- Problem-based learning sessions: 6 (on-site hours) + 15 (off-site hours)
Total: 21 (on-site hours) + 54 (off-site hours) = 75 hours
It is important to develop basic routines and fluency with some of the tools presented in the course. For this reason, students are advised to repeat and extend, on their own at home, the exercises carried out in the interactive sessions.
This course is offered by the University of Vigo.
It is recommended to have taken the following subjects:
Natural Language Understanding
Language Modeling