ITC
:: Linguistica generale e computazionale / General and Computational
Linguistics
Computational Linguistics: Provisional Course Outline
Objectives
- Understand what computational linguistics can and cannot do
- Understand the fundamentals, so that you can work together with
computational linguists
- (Optional) Learn programming and some practical corpus linguistics
Assessment
- Option 1: written exam only (50% computational, 50% general)
- Option 2: 25% by continuous assessment (weekly computational exercises); 25% by
written computational exam; 50% by written general exam
Students can choose to be assessed exclusively on the basis of the
written exams (Option 1), or to have the computational half of the mark
split equally between the weekly exercises and the written computational
exam (Option 2).
The deadline for each individual exercise will be set week by week. The
deadline for submission of the final exercise is Friday 27 May. After
this date, students can email me directly for a summary of their marks.
See the end of the page for a description of the exam.
Questions
- Please send me your email address
- Do you have any special interests or requests?
- What is your background?
Lecturers
Materials & References
- The materials for each lecture will all be linked from these
webpages
- The basic reference for the Python language and general
principles of programming will be the Italian-language Tutorial
per principianti in Python, which is also available as a PDF file
if you would prefer to print it out (if you have trouble opening this
file, you can also try here).
- Two other good references, which assume background knowledge of
programming, are the very short Python Istantaneo and the much longer
and more comprehensive Dive Into Python.
- For the main content on computational linguistics, we will be using
the Natural Language Toolkit (NLTK), an add-on module for Python. The
associated book is available free on the web, or you can buy the paper
version for about €30.
The Exam
- The exam will be a written, pen-and-paper exam in Italian. Reference
books may not be consulted.
- The questions will cover these topics:
- Basic structure and function of a computer
- Principles of common computational linguistics applications and
resources
- Basic principles of programming:
- operators (+, *, /, etc.)
- keyboard input
- control structures (e.g. if/else; for/while loops)
- logical conditions
- lists, dictionaries and sorting
- modules and functions
- Representation of text in computer memory and variables
- Basic corpus statistics: token/type distinction; type
frequency; lexical diversity; unigrams and bigrams
- Pattern matching and substitution (regular expressions)
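The corpus-statistics and pattern-matching topics above can be illustrated with a minimal sketch. It uses only the Python 3 standard library rather than NLTK, so that it runs without any add-on module. The sample sentence, the variable names, and the definition of lexical diversity as types divided by tokens are assumptions made for illustration; they are not taken from the course materials.

```python
import re
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = "The cat sat on the mat. The dog sat on the log."

# Pattern matching: extract lowercase word tokens with a regular expression.
tokens = re.findall(r"[a-z]+", text.lower())

# Token/type distinction: tokens are running words, types are distinct words.
types = set(tokens)

# Type frequency: a dictionary-like count of how often each type occurs.
freq = Counter(tokens)

# Lexical diversity, assumed here to mean types divided by tokens.
lexical_diversity = len(types) / len(tokens)

# Unigrams are the tokens themselves; bigrams are pairs of adjacent tokens.
bigrams = list(zip(tokens, tokens[1:]))

# Substitution: replace the whole word "sat" with "slept".
replaced = re.sub(r"\bsat\b", "slept", text)

print(len(tokens), "tokens,", len(types), "types")
print("frequency of 'the':", freq["the"])
print("lexical diversity:", round(lexical_diversity, 2))
print("first bigrams:", bigrams[:3])
print(replaced)
```

Lowercasing before tokenizing means that "The" and "the" count as the same type; whether that is the right choice depends on the question being asked of the corpus.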
- The questions will be technical in format: they will either test your
understanding of a concept or ask you to write a short sample of code
to solve a problem. In programming questions, marks will be given for
solutions that show you understand how to solve the problem; trivial
syntax mistakes will be ignored.
- What materials to study: