Welcome to my homepage! I am an associate professor in the CLIC group of CIMeC, the Center for Mind/Brain Sciences of the University of Trento. I am also a member of the DISI Department. My broad research areas are computational linguistics and cognitive science. For more details about me, please visit my education and academic/professional history page.



My main current research topic is distributional semantics. I am exploring the idea that human conceptual (semantic) knowledge is, to a considerable extent, the result of extracting simple distributional information from large amounts of linguistic input. That is, we think of barking, having a tail and being a pet as salient properties of dogs because we have heard or read many sentences and phrases such as "her dog barks all the time", "the dog wagged its tail" or "dogs and other pets".
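To make the idea concrete, here is a minimal sketch of what "simple distributional information" could look like in practice. It counts the words that occur near dog/dogs in the three example phrases above. The function name, the window size and the whitespace tokenization are illustrative assumptions of this sketch, not a description of the models we actually use.

```python
from collections import Counter

# Toy corpus: the three example phrases quoted in the text.
CORPUS = [
    "her dog barks all the time",
    "the dog wagged its tail",
    "dogs and other pets",
]


def cooccurrence_counts(sentences, target_forms, window=3):
    """Count words found within `window` tokens of any target form.

    Hypothetical helper: tokenization is a plain whitespace split and
    the window size is an arbitrary choice made for this illustration.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token not in target_forms:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # skip the target word itself
                    counts[tokens[j]] += 1
    return counts


dog_contexts = cooccurrence_counts(CORPUS, {"dog", "dogs"})
print(dog_contexts.most_common())
```

Even on three phrases, "barks", "tail" and "pets" show up among the contexts of dog. Real models apply the same kind of counting to very large corpora and then reweight the raw counts.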

While this sounds counter-intuitive, I am encouraged by the large body of work in corpus-based Natural Language Processing that has shown how simple pattern extraction techniques can harvest very rich and multi-faceted knowledge from raw text, well beyond what is learned by models based on other sources.

The main contribution that my colleagues and I have made to this line of research is the development of flexible distributional models that can capture and distinguish different kinds of semantic relations at once. That is, the models learn from raw text data that dog is related both to barking and to pet, but also that the first relation is more like the one between boat and floating, and the second more like the one between boat and vehicle. Consequently, our models can simulate human behaviour in a variety of otherwise unrelated semantic tasks.

Most recently, we started looking at the fundamental issue of composition of meaning within the distributional framework. That is, we are developing methods to derive the meaning of combined expressions such as pink dinosaur or many dinosaurs from the representations of the component words that we extract from text with distributional techniques. The ability to construct an infinity of meanings by composing finite units is a fundamental aspect of human language and cognition, and any computational system that attempts to simulate human behaviour, for scientific or practical purposes, must possess a similar capacity. In 2011, we were awarded a large ERC Starting Grant for the 5-year COMPOSES project on compositionality in distributional semantics, so this will be the main focus of our research for a long time.
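As a rough illustration of what composition means here, the sketch below builds a vector for pink dinosaur by adding the vectors of its two words, and then compares the result with each component. Vector addition is only one simple baseline, chosen for this example; it is not meant to represent the methods developed in COMPOSES. The vectors and their three dimensions are invented numbers, and the function names are hypothetical.

```python
import math

# Invented co-occurrence vectors over three made-up context dimensions.
# The numbers carry no empirical meaning; they only make the sketch runnable.
VECTORS = {
    "pink": [4.0, 0.0, 1.0],
    "dinosaur": [0.0, 5.0, 2.0],
}


def compose_additive(u, v):
    """Compose two word vectors by component-wise addition (a simple baseline)."""
    return [a + b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


pink_dinosaur = compose_additive(VECTORS["pink"], VECTORS["dinosaur"])
print(pink_dinosaur)
print(cosine(pink_dinosaur, VECTORS["dinosaur"]))
print(cosine(pink_dinosaur, VECTORS["pink"]))
```

With these toy numbers the composed vector stays closer to dinosaur than to pink, but that reflects the invented values and nothing more.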

A second important research strand that I am pursuing, in part thanks to a Google Research Award, pertains to the extension of distributional semantics to encompass distributional information coming from images that co-occur with words (for example, in Web pages). This "multimodal" approach to meaning should make distributional semantics even more human-like, since we humans acquire meaning not only from linguistic contexts, but also from perceptual cues.

My main interest, as an instructor, is to provide my students -- cognitive scientists, (computational) linguists, philosophers -- with the kind of quantitative and computational background that is often lacking in these disciplines.

My teaching activities currently take place within the Language and Multimodal Interaction track of our International Master in Cognitive Science.

Current or recent teaching activities at the University of Trento include:

I have also had several chances to teach "statistics for linguists" mini-courses, mostly with Stefan Evert: see the SIGIL page.

For my complete teaching experience, please refer to my education and academic/professional history page.

Email address: marco baroni AT unitn it

Snail mail address: Marco Baroni, CIMeC (Università di Trento), Palazzo Fedrigotti, Corso Bettini 31, 38068 Rovereto, Italy.

Phone: +39 0464-808612.

