The present infrastructure of ILSP comprises a modern, proprietary 2,727 m² building in Athens equipped with intranet and structured cabling, a high-speed internet connection through GRNET, central computing facilities organized in 2 computer rooms, an in-house library, and a specialized Multimedia Recording Studio. ILSP also possesses a proprietary modern building in Xanthi (region of Western Thrace in northern Greece), where its regional branch resides.
To assist the participating research groups in pursuing novel approaches in the priority research areas, an investment was made to upgrade and enhance ILSP’s equipment and technical infrastructure, enabling it to address the new multimodal and multifaceted requirements of modern language technologies and to support its growth and networking plans.
Acquiring the capability to capture, analyse, organize and exploit such information is an indispensable component of the growth strategy of any organization active in the area of language technology. Therefore, an investment was deemed necessary to acquire sophisticated equipment for capturing data on multimodal human-to-human interaction in naturalistic settings and mundane tasks. The resulting multisensory recordings comprise speech recordings, video recordings, full-body motion capture measurements, eye tracking, object tracking, etc. Such unique multisensory recordings have enabled the study of language and communication in relation to human sensory-motor interaction with the physical environment and with others.
The equipment acquired falls into the following categories:
This equipment is being used to capture and measure non-verbal features (gestures, facial expressions, body movements, etc.), which contribute significantly to human communication, and to correlate them with features extracted through text, speech, and vision analysis techniques. It has also allowed ILSP to extend its work to special target groups with various disabilities as well as to young children. The existing studio has been reinforced with additional capturing equipment, and an adjacent room was used for cognitive experiments. Auxiliary, portable equipment facilitated data collection in the field, in realistic settings.
The priority research area of multimodal communication directly benefited from such equipment. Capturing and modelling significant points of interest on a speaker’s face in parallel to audio recordings provided the resources necessary for building realistic multimodal synthesizers. This equipment also provided significant support for other existing, strong research activities of ILSP, such as sign language synthesis, analysis and teaching, where multimodality (facial expressions, hand and head movements, gestures) played a critical role.
This equipment was also used to correlate captured physiological reactions with certain linguistic expressions of sentiments and emotions, preferably in naturalistic settings. Such unique multisensory recordings allowed the study of language in relation to human sensory-motor interaction with the world and with others.
At the other end from capturing multimodal communication patterns is the research topic of generating them. Embodied conversational agents are a form of intelligent user interface. They aim to unite gesture, facial expression, and speech to enable face-to-face communication with users: a powerful means of human-computer interaction. Face-to-face communication supports protocols that provide a much richer channel than other means of communicating. It enables pragmatic communication acts such as conversational turn-taking, facial expression of emotions, information structure and emphasis, visualisation and iconic gestures, and orientation in a three-dimensional environment. This communication takes place through both verbal and non-verbal channels such as gaze, gesture, spoken intonation and body posture. Embodied agents also provide a social dimension to the interaction. Humans willingly ascribe social awareness to computers, and thus interaction with embodied agents follows social conventions, similar to human-human interactions.
Related priority research areas: 1 – Multimodal communication
Digitization is the key to heritage preservation. Scanning and processing rare books, collections and national archives have received particular attention in the era of digital libraries and Europeana. Such a system significantly increased ILSP’s capacity to undertake and carry out R&D projects in the area of cultural heritage, with emphasis on handwriting recognition of historical manuscripts. The added research capacity that such equipment was estimated to provide relates to:
Related priority research areas: 2 – Multimedia processing
It was intended for use by research teams in day-to-day R&D activities. It was used to process and exploit the new streams of multimodal data captured by the newly acquired equipment.