Marco Baroni's Homepage

Spider and I (photo by Moto)

I am a tenured researcher in the CLIC group of CIMeC, the Center for Mind/Brain Sciences of the University of Trento, and a member of the DISCoF Department.

Research

My current research focuses on two related themes.

First, I am exploring the crazy idea that human conceptual knowledge is mostly the result of the extraction of simple distributional information from large amounts of verbal input, i.e., that we think of barking, having a tail and being a pet as salient properties of dogs because we heard/read lots of sentences/phrases like "her dog barks all the time", "the dog wagged its tail" or "dogs and other pets".

While this sounds counter-intuitive, I am encouraged by the large body of work in corpus-based Natural Language Processing that has shown how simple pattern extraction techniques can harvest very rich and multi-faceted knowledge from raw text -- way beyond what is learned by models based on visual cues, non-verbal interaction and other sources.
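To make the dog example concrete, here is a minimal sketch of the kind of simple pattern extraction I have in mind: counting which words co-occur with "dog" in the same phrase. It is only a toy illustration, not one of the actual models from my papers; the stop-word list, the crude suffix stripping and the function names are all assumptions made up for this example.

```python
import re
from collections import Counter

# Toy corpus: the three example phrases about dogs quoted above.
corpus = [
    "her dog barks all the time",
    "the dog wagged its tail",
    "dogs and other pets",
]

# Hypothetical, hand-picked stop-word list for this toy example only.
STOP_WORDS = {"her", "all", "the", "its", "and", "other"}


def normalize(token):
    """Strip a final 's' from longer words (a crude stand-in for lemmatization)."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def cooccurrences(target, sentences):
    """Count the content words that share a sentence with the target word."""
    counts = Counter()
    for sentence in sentences:
        tokens = [normalize(t) for t in re.findall(r"[a-z]+", sentence.lower())]
        if target in tokens:
            counts.update(
                t for t in tokens if t != target and t not in STOP_WORDS
            )
    return counts


dog_profile = cooccurrences("dog", corpus)
print(dog_profile)
```

On this tiny corpus the profile for "dog" contains "bark", "tail" and "pet" (plus "time" and "wagged"), each seen once. With three sentences this is trivial, of course; the claim is that the same kind of counting, applied to very large amounts of text, yields rich conceptual knowledge.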

My most representative work in this direction is summarized here and here.

I am also interested in the applied offshoots of our research on concept induction -- in particular, I am currently the recipient of a Google Research Award to study convergences between text- and vision-based semantic spaces and of an Italian PRIN grant to model semantic cognition in the blind with corpus-based distributional models. I am also collaborating on projects on semantic fields in pedagogical lexicography and on the creation of a concept database for clinical work with patients with verbal and conceptual deficits, as well as on the mighty LiveMemories project.

The second active strand of my research pertains to the creation of the electronic resources (such as corpora and lexica) and computational tools that enable the sort of quantitative computer-based simulations I describe above. My tools and resources page provides links to some stuff I (co-)developed.

Please take a look at my publications page for more topics I have been working on.

By the way: Are you on Facebook? Then please help us collect better semantic data by playing with us!

Teaching

As a linguist, I received a "humanist" training, i.e., no math, very little programming, very basic statistics. I am still trying to fully recover from this.

Consequently, my main interest, as a teacher, is to provide my students -- (computational) linguists, philosophers, cognitive scientists -- with a decent quantitative and computational background.

The most concrete step in this direction has been setting up, together with my CLIC colleagues and the Philosophy section of the School of Humanities, a "Philosophy and Informatics" major within the BA- and MA-level Philosophy degrees. Basic information (in Italian) about these programs is available here and here.

Current teaching activities at the University of Trento include:

I also had several chances to teach "statistics for linguists" mini-courses, mostly with Stefan Evert: see the SIGIL page.

Stuff on this site:

My email address: marco baroni AT unitn it

My snail mail address: Marco Baroni, CIMeC (Università di Trento), Palazzo Fedrigotti, C.so Bettini 31, 38068 Rovereto, Italy.