Words are the building blocks of sentences, yet the meaning of a sentence goes well beyond the meanings of the words within it. Indeed, while we have dictionaries for words, we seem to need no such resource to infer the meaning of a sentence from the meanings of its constituents. Discovering the process of meaning assignment in natural languages is one of the most foundational issues in linguistics and computer science; its resolution would deepen our understanding of cognition and intelligence and may assist in automating language-related tasks, such as document search as performed by Google.
To date, the compositional logical and the distributional probabilistic models have provided two complementary partial solutions to the problem of meaning assignment in natural languages. The logical approach is based on classic ideas from mathematical logic, mainly Frege's principle that the meaning of a sentence can be derived from the relations among the words in it. The distributional model is more recent; it can be related to Wittgenstein's philosophy of `meaning as use', whereby the meanings of words are determined by their contexts. The logical models have been the champions on the theory side, whereas in practice their probabilistic rivals have provided the best predictions. This two-sortedness of the defining properties of meaning, `logical form' versus `contextual use', has left the question `what is the foundational structure of meaning?' even more open than before. This project has ambitious and far-reaching goals: it aims to bring these two complementary concepts together to tackle the question, and to do so by bridging the fields of linguistics, computer science, logic, probability theory, category theory, and even physics. Its scope is foundational, multi- and interdisciplinary, with an eye towards applications.
Meaning assignment is a dynamic interactive process involving grammar and logic as well as the meanings of words. Each of the two existing approaches to language misses a crucial aspect of this process: the logical model ignores the meanings of words; the distributional model ignores grammar and logic. We aim to model the entire dynamic process along the following three strands of integration, foundations, and applications.
(I) In integration, we develop a process of meaning assignment in which the compositional forms of the logical model act on the contextual word-meaning entities of the distributional model.
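The core idea of this integration can be sketched in a few lines of code. In the toy example below (all vectors, the matrix, and the context words are illustrative assumptions, not real corpus statistics), word meanings are distributional context vectors, while a grammatical word such as an adjective is modelled as a linear map (a matrix) that acts compositionally on the vector of the noun it modifies; similarity of the resulting meaning vectors is measured by cosine. This is only a minimal sketch of the kind of model intended, not a definitive implementation.

```python
# Toy compositional-distributional sketch (all numbers are assumptions).
# Nouns: context vectors of co-occurrence counts with three context words.
# Adjective "fast": a matrix acting on noun vectors, so phrase meaning
# is built compositionally from word meaning plus grammatical role.

from math import sqrt

CONTEXTS = ["run", "purr", "bark"]  # hypothetical context dimensions

noun = {
    "dog": [4.0, 0.0, 9.0],
    "cat": [3.0, 8.0, 0.0],
}

fast = [  # hypothetical linear map for the adjective "fast"
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def apply_matrix(m, v):
    """Apply a grammatical operator (matrix) to a word vector."""
    return [sum(row[i] * v[i] for i in range(len(v))) for row in m]

def cosine(u, v):
    """Similarity of two meaning vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sqrt(sum(a * a for a in w))
    return dot / (norm(u) * norm(v))

fast_dog = apply_matrix(fast, noun["dog"])
fast_cat = apply_matrix(fast, noun["cat"])

# The phrase vector stays close in meaning to its head noun:
print(cosine(fast_dog, noun["dog"]))
print(cosine(fast_dog, fast_cat))
```

The design point is that the matrix plays the role of the logical model's compositional form, while the vectors it acts on come from the distributional model's contextual statistics.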
(II) In foundations, we go beyond the classical logical principles of compositionality and the context-based models of meaning to develop more fundamental processes of meaning assignment, based on novel information-flow techniques, mainly from physics, but also from other linguistic approaches and other models of word meaning, such as ontological domains and conceptual spaces.
(III) In applications, we evaluate our theories against naturally occurring data and apply the results to practical problems based on meaning inference and similarity, e.g. in search. To work with logical connectives in Google, one must enter them by hand in the `advanced search' tab, manually decomposing the logical structure of the sentence and, moreover, providing the extra context for their different meanings. This is fundamentally non-compositional and goes against the spirit of automated search. It is exactly here that the lack of compositional methods of meaning assignment causes practical problems, and where our compositional methods become of use. Hence, we aim to put our results forward to tackle such problems, e.g. to use our sentence-similarity models for paraphrasing, question answering, and retrieving documents that have the same meaning and/or are about the same subject. Our proposed partnership with Google ensures access to real-life data and helps the implementation and applicability of our methods at both small and large scales.