Time	Day 1: Wednesday 3rd	Time	Day 2: Thursday 4th	Time	Day 3: Friday 5th
08:00 – 08:30	Welcome desk	08:00 – 08:30		08:00 – 08:30
08:30 – 09:00	Opening plenary	08:30 – 09:00		08:30 – 09:00
09:00 – 10:00	Keynote: F. Pachet	09:00 – 10:00	Keynote: C. Schmid	09:00 – 10:00	Keynote: A. Vinciarelli
10:00 – 10:30	Coffee break	10:00 – 10:30	Coffee break	10:00 – 10:30	Coffee break
10:30 – 12:30	Oral session: SS4	10:30 – 12:30	Oral session: SS6	10:30 – 12:30	Oral session: SS7
12:30 – 14:15	Lunch break	12:30 – 14:15	Lunch break	12:30 – 14:15	Lunch break
14:15 – 15:45	Poster session: SS2 + RS2-RS3	14:15 – 15:15	Keynote: H. Trettenbrein	14:15 – 15:15	Oral session: RS1
15:45 – 16:15	Coffee break	15:15 – 15:30	Coffee break	15:15 – 15:30	Coffee break
16:15 – 17:55	Oral session: SS8	15:30 – 17:00	Poster session: EU demos + SS1	15:30 – 17:00	Poster session: SS3 + SS5
18:00 – 20:00	Jazz, Wine & Cheese reception	18:00 – 23:00	Social event: Orsay museum

RS1: Regular session – Human activity/action/gesture recognition;
RS2: Regular session – Audio and music analysis;
RS3: Regular session – Multimedia content analysis and computer vision;
SS1: Special session – 3D reconstruction, coding and transmission for audiovisual interactive services;
SS2: Special session – Automatic categorization of multimedia web data, mono-modal and multi-modal approaches;
SS3: Special session – Content Enhancement for Improved Multimedia Applications;
SS4: Special session – Informed Music Audio Processing;
SS5: Special session – Real World Sound Scene Analysis;
SS6: Special session – Semantic Media – Time-Based Navigation in Large Collections of Media Documents;
SS7: Special session – Social stance analysis;
SS8: Special session – Visual attention, a multidisciplinary topic: from behavioral studies to computer vision applications.

Detailed technical program

Wednesday 3rd

10:30 – 12:30 – Oral session – SS4

10:30 – 10:50: Real-time guitar string detection for music education software (Christian Dittmar*, Andreas Männchen and Jakob Abesser)
10:50 – 11:10: Looking beyond sound: unsupervised analysis of musician videos (Cynthia Liem*, Alessio Bazzica and Alan Hanjalic)
11:10 – 11:30: An overview of informed audio source separation (Antoine Liutkus*, Jean-Louis Durrieu, Laurent Daudet and Gaël Richard)
11:30 – 11:50: A generic classification system for multi-channel audio indexing: application to speech and music detection (Elie-Laurent Benaroya* and Geoffroy Peeters)
11:50 – 12:10: Freischütz digital: a multimodal scenario for informed music processing (Meinard Mueller*, Thomas Prätzlich, Benjamin Bohl and Joachim Veit)
12:10 – 12:30: Query-by-example retrieval of sound events using an integrated similarity measure of content and label (Annamaria Mesaros*, Toni Heittola and Kalle Palomäki)

14:15 – 15:45 – Poster session – SS2 + RS2 and RS3

A LDA-based method for automatic tagging of youtube videos (Mohamed Morchid* and Georges Linarès)
Searching segments of interest in single story web-videos (Mickael Rouvier, Georges Linarès*, Benoit Favre and Bernard Merialdo)
Introducing Motion Information in Dense Feature Classifiers (Claudiu Tanase* and Bernard Merialdo)
Fusion methods for multimodal indexing of web data (Usman Niaz* and Bernard Merialdo)
Exploring intra-bow statistics for improving visual categorization (Usman Niaz* and Bernard Merialdo)
Exploring new features for music classification (Rémi Foucard*, Slim Essid, Gaël Richard and Mathieu Lagrange)
A heuristic for distance fusion in cover song identification (Alessio Degani*, Marco Dalai, Riccardo Leonardi and Pierangelo Migliorati)
Ultra-low latency audio coding based on DPCM and block companding (Gediminas Simkus*, Martin Holters and Udo Zölzer)
Infrared ship target segmentation based on region and shape features (Zhaoying Liu, Fugen Zhou and Xiangzhi Bai*)
Large-scale Semi-supervised Learning by Approximate Laplacian Eigenmaps, VLAD and Pyramids (Eleni Mantziou*, Symeon Papadopoulos and Yiannis Kompatsiaris)
Identification of moving objects in visual surveillance data (Jogile Kuklyte*, Kevin McGuinness, Ramya Hebbalaguppe, Cem Direkoglu, Leonardo Gualano and Noel O’Connor)
Vision-based maritime surveillance system using fused visual attention maps and online adaptable tracker (Konstantinos Makantasis*, Anastasios Doulamis and Nikolaos Doulamis)
Densely sampled local visual features on 3D mesh for retrieval (Yuya Ohishi and Ryutarou Ohbuchi*)

16:15 – 17:55 – Oral session – SS8

16:15 – 16:35: An application framework for implicit sentiment human-centered tagging using attributed affect (Konstantinos Apostolakis* and Petros Daras)
16:35 – 16:55: Superpixel-based saliency detection (Zhi Liu*, Olivier Le Meur and Shuhua Luo)
16:55 – 17:15: Sample Specific Late Fusion for Saliency Detection (Jie Sun and Congyan Lang*)
17:15 – 17:35: Affine invariant salient patch descriptors for image retrieval (Furkan Isikdogan* and Albert Salah)
17:35 – 17:55: Toward the introduction of auditory information in dynamic visual attention models (Antoine Coutrot* and Nathalie Guyader)

Thursday 4th

10:30 – 12:30 – Oral session – SS6

10:30 – 10:50: Event-driven Retrieval in Collaborative Photo Collections (Markus Brenner* and Ebroul Izquierdo)
10:50 – 11:10: Recent advances in affective and semantic media applications at the BBC (Jana Eggink* and Yves Raimond)
11:10 – 11:30: Hello Cleveland! Linked Data Publication of Live Music Archives (Sean Bechhofer*, Kevin Page and David De Roure)
11:30 – 11:50: Semi-Automated Video Logging by Incremental and Transfer Learning (Jongdae Kim and John Collomosse*)
11:50 – 12:10: Challenges of Finding Aesthetically Pleasing Images (João Faria, Stanislav Bagley*, Stefan Rüger and Toby Breckon)
12:10 – 12:30: Describing audio production workflows on the Semantic Web (Gyorgy Fazekas* and Mark Sandler (QMUL))

15:30 – 17:00 – Poster session – SS1 + EU project demos

SS1

Sound field reproduction for consumer and professional audio applications (Etienne Corteel* and Khoa-Van Nguyen)
Blending real with virtual in 3Dlife (Konstantinos Apostolakis*, Dimitrios Alexiadis, Petros Daras, David Monaghan, Noel O’Connor, Benjamin Prestele, Peter Eisert, Gaël Richard, Qianni Zhang, Ebroul Izquierdo, Maher Ben Moussa and Nadia Magnenat)
A concise survey for 3D reconstruction of building facades (Patrycia Klavdianos*, Qianni Zhang and Ebroul Izquierdo)

EU project demos

ALICE – Assistance for better mobility and improved cognition of elderly blind and visually impaired (Titus Zaharia)
AXES – The AXES pro video search system (Kevin McGuinness)
QUAERO – Audio oriented annotation of audiovisual content: a professional prototype (Félicien Vallet)
QUAERO II (Gregory Grefenstette)
REVERIE – Real and virtual engagement in realistic immersive environments (Noel O’Connor)
REWIND (Christian Dittmar)
SAVASA – Standards-based approach to video archive search and analysis (Suzanne Little)
SOCIALSENSOR – Sensing user generated input for improved media discovery and experience (Nikos Sarris)
TOSCA-MP – Task-oriented search and content annotation for media production (Werner Bailer)
VENTURI – ImmersiVe ENhancemenT of User-woRld Interactions (Paul Chippendale)

Friday 5th

10:30 – 12:30 – Oral session – SS7

10:30 – 10:50: The expressivity of turn-taking: understanding children pragmatics by hybrid classifiers (Cristina Segalin*, Anna Pesarin, Alessandro Vinciarelli and Marco Cristani)
10:50 – 11:10: Group detection in still images by F-formation modeling: a comparative study (Francesco Setti*, Marco Cristani and Hayley Hung)
11:10 – 11:30: Likability of human voices: A feature analysis and a neural network regression approach to automatic likability estimation (Florian Eyben*, Felix Weninger, Erik Marchi and Bjorn Schuller)
11:30 – 11:50: Getting rid of pain-related behaviour to improve social and self perception: A Technology-Based Perspective (Min Aung, Bernardino Romera-Paredes, Aneesha Singh, Soo Ling Lim, Natalie Kanakam, Amanda Williams and Nadia Bianchi-Berthouze*)
11:50 – 12:10: Social stances by virtual smiles (Magalie Ochs*, Catherine Pélachaud and Ken Prepin)
12:10 – 12:30: Automatic Recognition of Personality and Conflict Handling Style in Mobile Phone Conversations (Alessandro Vinciarelli*, Hugues Salamin and Anna Polychroniou)

14:15 – 15:15 – Oral session – RS1

14:15 – 14:35: Real-Time Head Nod and Shake Detection for Continuous Human Affect Recognition (Haolin Wei*, Patricia Scanlon, Yingbo Li, David Monaghan and Noel O’Connor)
14:35 – 14:55: Tapped delay multiclass support vector machines for industrial workflow recognition (Eftychios Protopapadakis*, Anastasios Doulamis and Nikolaos Doulamis)
14:55 – 15:15: Multimodal classification of dance movements using body joint trajectories and step sounds (Aymeric Masurelle*, Slim Essid and Gaël Richard)

15:30 – 17:00 – Poster session – SS3 + SS5

JPEG backward compatible format for 3D content representation (Philippe Hanhart*, Pavel Korshunov, Martin Rerabek and Touradj Ebrahimi)
On coding and resampling of video in 4:2:2 chroma format for cascaded coding applications (Andrea Gabriellini and Marta Mrak*)
Optimized tone mapping with flickering constraint for backward-compatible high dynamic range video coding (Alper Koz* and Frederic Dufaux)
Versatile layered depth video coding based on distributed video coding (Giovanni Petrazzuoli, Corina Macovei, Irina-Emilia Nicolae, Marco Cagnazzo*, Frédéric Dufaux and Béatrice Pesquet)
Acoustic recursive bayesian estimation for non-field-of-view targets (Makoto Kumon* and Tomonari Furukawa)
Robust Localization and Tracking of Multiple Speakers in Real Environments for Binaural Robot Audition (Ui-Hyun Kim* and Hiroshi Okuno)
Robust Spectro-Temporal Speech Features with Model-Based Distribution Equalization (Samuel Kevin Ngouoko Mboungueng*, Martin Heckmann and Britta Wrede)
A Nested Infinite Gaussian Mixture Model for Recognizing Known and Unknown Audio Events (Yoko Sasaki*, Kazuyoshi Yoshii and Satoshi Kagami)
Saliency-based modeling of acoustic scenes using sparse non-negative matrix factorization (Benjamin Cauchi, Mathieu Lagrange*, Nicolas Misdariis and Arshia Cont)
Footstep Detection and Classification Using Distributed Microphones (Kazuhiro Nakadai*, Yuta Fujii and Shigeki Sugano)
