TIGER & BDDA Seminar

This is a discussion group for the Information retrieval GEneral Reading Group (TIGER) and the Big Data and Data Analytics (BDDA) Research Group.
We meet every Thursday, from 12:30-1:30pm. We occasionally have seminars and events at other times as well.
Want to present something? Students, staff, and interested parties outside the university are welcome to attend. We encourage all members to suggest new research topics and to guide and participate in the weekly discussions. For inquiries about joining TIGER or guiding a discussion, please contact Xiaolu.

Upcoming Meetings

Thursday 11 January
Location: Building 80 Level 09 Room 08
Presenter: Miss Chen Xi
Paper discussion: Event Early Embedding: Predicting Event Volume Dynamics at Early Stage

The paper is:

Liu, Zhiwei, et al. “Event Early Embedding: Predicting Event Volume Dynamics at Early Stage.” Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2017.

Social media has become one of the most credible sources for delivering messages, breaking news, and events. Predicting the future dynamics of an event at a very early stage is highly valuable, e.g., helping companies anticipate marketing trends before the event becomes mature. However, this prediction is non-trivial because a) social events always sit alongside “noise” under the same topic and b) the information obtained at an early stage is too sparse and limited to support an accurate prediction. To overcome these two problems, in this paper we design an event early embedding model (EEEM) that can 1) extract social events from noise, 2) find similar previous events, and 3) predict the future dynamics of a new event. Extensive experiments conducted on a large-scale dataset of Twitter data demonstrate the capacity of our model to extract events and the promising prediction performance obtained by considering both volume information and content information.

Past Meetings

Meetings in 2017 Q4

Thursday 14 December
Yipeng Zhou (University of South Australia)
[TIGER] Intelligent Online Video Services

Online video streaming is becoming the dominant way of consuming Internet content. Thanks to the thriving Internet video market, both the video population and the user population have grown explosively in the past few years, bringing challenges for online video providers (OVPs). To avoid overwhelming users with an astronomical video population, it is crucial for OVPs to proactively and intelligently serve users by locating videos of interest to them. Log systems and online social networks (OSNs) provide OVPs with plentiful data to learn user interests by recording users’ viewing, commenting, rating and sharing behaviors. In this talk, three works are presented with the aim of building intelligent online video services. Specifically, malicious users who generate fake views to selfishly promote the popularity of particular videos are identified via a TSVM classifier. Context-aware recommendation is developed by incorporating knowledge of user locations inferred from access points. Epidemic models are created to analyze and quantify how different video recommendation mechanisms affect video information diffusion processes by mining view count traces.

Yipeng Zhou is a research fellow with the Institute for Telecommunications Research at the University of South Australia. He was an assistant professor with the College of Computer Science and Software Engineering at Shenzhen University from Sep. 2013 to Aug. 2016. He received his B.S. degree from the Department of Computer Science at the University of Science and Technology of China (USTC), and his M.Phil. and Ph.D. degrees from the Department of Information Engineering at The Chinese University of Hong Kong (CUHK). From 2012 to 2013, he was a postdoctoral research fellow with the Institute of Network Coding of The Chinese University of Hong Kong. His research interests mainly lie in user behavior analysis, video recommendation, information diffusion, reputation systems and edge computing.

12.10.02, 12:30 – 13:30

Thursday 7 December
Avinesh P.V.S. (UKP labs, TU Darmstadt, Germany)
[TIGER] Joint Optimization of User-desired Content in Multi-Document Summarization by Learning from User Feedback

In this talk, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. Our method interactively obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. Our methods complement fully automatic methods in producing high-quality summaries with a minimum number of iterations and minimal feedback. We conduct multiple simulation-based experiments and analyze the effect of feedback-based concept selection in the ILP setup in order to maximize the user-desired content in the summary.

Avinesh P.V.S. is a third-year doctoral student in the AIPHES training group, UKP labs, TU Darmstadt, Germany. He is working on Interactive Personalized Summarization, which keeps a human in the loop during the process of creating a summary. His main interests are personalization, summarization, natural language understanding and information retrieval.

He obtained a B.Tech and an MS by Research from IIIT-Hyderabad, India, where he worked on building NLP tools for Indian languages and wrote his master’s thesis on “Transfer Grammar Engine and Automatic Learning of Reorder rules in MT” with Prof. Rajeev Sangal and Prof. Dipti Misra Sharma.

After his master’s, Avinesh worked with Lexical Computing Ltd and IBM Watson, India, where he worked on a wide range of products, from corpus lexicography and Machine Translation to NLP modeling for Oncology Expert Advisor.

Website: www.avineshpvs.com

12.10.02, 12:30 – 13:30

Thursday 30 November
Eliezer de Souza da Silva (Norwegian University of Science and Technology)
[TIGER] Probabilistic Non-Negative Latent Factor Models for Spatio-Temporal Data and Recommender Systems

In this talk, we will introduce and discuss Poisson factorization models and their application to recommendation. We will start by introducing Poisson Matrix Factorization, showing the advantages of using a Poisson likelihood for implicit count data, and then proceed to models that include other contextual information: social information, topic models and spatio-temporal data. The first part will be about existing models: Poisson matrix factorization (PF), social Poisson factorization (SPF), collaborative topic Poisson factorization (CTPF) and Poisson matrix factorization with content and social information (PoissonMF-CS) (ECML-PKDD, 2017). The second part will discuss some possible extensions for dealing with spatio-temporal data: Poisson Tensor Factorization, State-Space models and (briefly) Point Processes (Poisson and Hawkes).

Eliezer de Souza da Silva is a PhD Research Fellow in the Data and Artificial Intelligence Group at the Department of Computer Science, Norwegian University of Science and Technology. His doctoral research is focused on Bayesian modelling and scalable (approximate) inference algorithms for personalization problems (Collaborative Filtering, Link Prediction). His research encompasses joint probabilistic modelling of document content, user information, spatio-temporal data and implicit/explicit feedback.

He obtained an M.Sc. degree in Computer Engineering from the School of Electrical and Computer Engineering, State University of Campinas, with a dissertation on metric space indexing for nearest neighbor search in a multimedia context; in particular, it generalized locality-sensitive hashing to metric spaces, with application to kNN search.

He obtained a B.Sc. in Computer Engineering at the Federal University of Espirito Santo, with a dissertation on content-based text retrieval using Latent Semantic Indexing (LSI) and Vector Space Model.

80.09.09, 12:30 – 13:30

Thursday 23 November
Farhana Choudhury
[BDDA] Top-m Rank Aggregation of Streaming Queries on Spatial Data

As a result of the increasing popularity of GPS-enabled mobile devices, the volume of content associated with both a geographic location and a text description is growing rapidly on the web. The availability of this data enables us to answer many real-life spatial-textual queries for different applications. This research addresses the increasing demand for answering real-life queries that are useful, especially in business and marketing applications. In addition, the need to improve performance when processing a large number of queries together arises in many problems. Specifically, the following queries will be discussed: (i) Batch processing of top-k spatial-textual queries; (ii) Optimal location and keyword selection queries; and (iii) Top-m rank aggregation on streaming spatial queries.

12.10.02, 12:30 – 13:30

Thursday 16 November
Joel MacKenzie
[TIGER] Managing Tail Latencies in Large Scale IR Systems

With the growing popularity of the world-wide-web and the increasing accessibility of smart devices, data is being generated at a faster rate than ever before. This presents scalability challenges to web-scale search systems – how can we efficiently index, store and retrieve such a vast amount of data? A large amount of prior research has attempted to address many facets of this question, with the invention of a range of efficient index storage and retrieval frameworks that are able to efficiently answer most queries. However, the current literature generally focuses on improving the mean or median query processing time in a given system. In the proposed PhD project, we focus on improving the efficiency of high percentile tail latencies in large scale IR systems while minimising end-to-end effectiveness loss. This talk will provide the motivation and overview of the specific research questions, the progress made on each, and a brief look at the future work.

12.10.02, 12:30 – 13:30

Thursday 9 November
Shuai Zhang (University of New South Wales)
[TIGER] Managing Tail Latencies in Large Scale IR Systems

With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning’s revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to their state-of-the-art performance. In this talk, I will present recent developments in deep learning based recommender models and highlight some future challenges and open issues in this research field.

Shuai Zhang has been a PhD student at the School of Computer Science and Engineering, University of New South Wales (UNSW), as well as at Data61, CSIRO, since Oct. 2016. His major research interests lie in the fields of recommender systems, deep learning and the Internet of Things. He has published articles at SIGIR and ICONIP on recommender systems. He received a Bachelor’s degree in 2014 from the School of Information Management, Nanjing University (NJU), where he majored in Information Management and Information System. He then worked as a software engineer at OPPO for two years before coming to UNSW.

12.10.02, 12:30 – 13:30

Thursday 2 November
Jaewon Kim
[TIGER] Understanding search behaviour on mobile devices

Web search on hand-held devices has become enormously common and popular. Although a number of works have revealed how users interact with search engine result pages (SERPs) on desktop monitors, there are still few studies of user interaction in mobile web search, and search results are shown in a similar way whether on a mobile phone or a desktop. Therefore, it is still difficult to know what happens between users and SERPs while searching on small screens, and this means that the current SERPs for mobile devices may not be the best displays.

In this talk, four key issues will be considered: 1) how screen size matters for desktop monitors vs mobile devices; 2) search behaviour on different small screen sizes; 3) horizontal viewport control in mobile web search; 4) the effect of snippet length on a mobile device.

The research has been conducted to provide ways to understand mobile web search behaviour, and the findings can be applied to a wide range of research areas such as information retrieval, human–computer integration, and even social science for a better presentation design of SERPs on mobile devices.

Website: www.jaewonkim.net

12.10.02, 12:30 – 13:30

Thursday 26 October
TIGER & BDDA Poster Session
Amin Sadri
Mining Human Mobility Pattern from Smartphone Data
What will you do today? Full trajectory prediction

Joel Mackenzie
Managing Tail Latencies in Large Scale IR Systems

Dong Qin
Change detection from media sharing community

Mingzhao (Kane) Li
Supporting Large-scale Geographical Visualization in a Multi-granularity Way

Kevin Ong
QWERTY: The Effects of Typing on Web Search Behavior

Sheng Wang
Trip Planning by an Integrated Search Paradigm

Shiwei Zhang
Word-Character Convolutional Neural Network

Lishan Cui
Influential Real-World Event Detection on Twitter Stream

Hiqmat Nisa
Handwritten Exam Papers Recognition System for Blind Academics

Fatimah Abdullah Alqahtani
A low-Resource Arabic Dialect Classification: Hijazi Dialect

Suliman Aladhadh
Beyond the culture effect on credibility perception on microblogs

Jonathan Liono
Exploration of Situation Inference from Mobile Data Streams
A Ubiquitous daTa Exploration platform for mobile sensing experiments

Hui Song
Multi-resolution Selective Ensemble Extreme Learning Machine for Electricity Consumption Prediction


80.06.05, 11:00 – 12:00

Wednesday 25 October

Bruce Croft (University of Massachusetts)
[TIGER] Conversations about Search: Addressing Information Needs through Interaction

Conversational search is being discussed everywhere now, from conferences to trade publications. In this talk, I will discuss my own perspective on the interesting research challenges from an IR point of view. I will also present some preliminary results related to this area.

W. Bruce Croft is a distinguished professor of computer science at the University of Massachusetts Amherst whose work focuses on information retrieval. He is the founder of the Center for Intelligent Information Retrieval and served as the editor-in-chief of ACM Transactions on Information Systems from 1995 to 2002. He was also a member of the National Research Council Computer Science and Telecommunications Board from 2000 to 2003. Since 2015, he has been the Dean of the College of Information and Computer Sciences at the University of Massachusetts Amherst. He was Chair of the UMass Amherst Computer Science Department from 2001 to 2007.


12.10.02, 12:30 – 13:30

Thursday 19 October

Jan Benetka (Norwegian University of Science and Technology)
[TIGER] Finding what you need with zero query terms (or less)

The Second Strategic Workshop on Information Retrieval in Lorne brought up a number of prospective directions in IR. In this talk, I’d like to review one of them: zero-query information retrieval. Zero-query IR systems, such as Google Now, are proactive search systems that return results based on the user’s context rather than as a reaction to her explicit query. One of the main benefits of this approach is providing low-cost information to the user when and where she needs it the most. On the other hand, many challenges remain that need to be addressed. Deciding which context to include, how to keep the system unobtrusive yet useful, and how to perform evaluation are just a few of them.

12.10.02, 12:30 – 13:30

Thursday 12 October

Xinjue Wang
[BDDA] Influence-oriented Community Analysis in Social Networks

The emergence of online social networks has fundamentally changed the way people communicate with each other. Scholars have never ceased devoting their time and energy to the phenomenon since its emergence. Among research on social networks, one line of study that has drawn a large amount of attention recently is the discovery of communities, i.e. relatively densely connected sub-networks. Discovering such structures, or communities, provides insight into the relationships between individuals and the composition of a social network. However, these studies mainly focus on the inner relationships between individuals inside a community structure and neglect the outward influence of a community as a whole. Therefore, there is a lack of studies that analyze communities in terms of social influence, which is fundamentally important for understanding the relationship between network structures and the information diffusion across them, and has many practical applications. For example, a company may try to find the most influential community in which to advertise its products; an organization may intend to initiate a campaign in the hope of attracting more diverse customers, i.e. a large number of communities instead of a large number of customers; an association may hope to minimize the influence of malicious information spread by one of its opponents, so that the community consisting of its core customers is affected the least. To fill this gap, in this thesis we analyze communities in terms of social influence and address the following three research questions. First, how can we identify the communities with dense intra-connections and the highest outer influence on the users outside the communities? Second, how can we maximize both the spread and the diversity of the diffusion at the end of the information propagation by selecting k influential users from a social network to spread the information? (Higher diversity means that more communities have been influenced.) Finally, how can we minimize the influence of the active nodes I, which have been infected by malicious information, on a target community T by deleting k edges in a social network?

To address the first research question, we propose a new metric to measure the likelihood that a community attracts other users outside the community within the social network, i.e., the community’s outer influence. Many applications need to rank communities by their outer influence, e.g., ad trending analytics, social opinion mining and news propagation pattern discovery by monitoring the influential communities. We refer to this problem as Most Influential Community Search. While the most influential community search problem in large social networks is important in various applications, it has been largely ignored by the academic research community. In this work, we systematically investigate this problem. First, we propose a new community model, the maximal kr-clique community, which has desirable properties, i.e., society, cohesiveness, connectivity, and maximum. Then, we develop a novel tree-based index structure, denoted as C-Tree, to maintain the offline computed r-cliques. To efficiently search for the most influential maximal kr-clique communities with the maximum outer influence, we develop four advanced index-based algorithms, which can improve the search performance of the non-indexed solution by about 200 times. The efficiency and effectiveness of constructing the index structure and evaluating the search algorithms have been verified using six real datasets including Facebook, Google+, Gowalla, Twitter, Youtube and Amazon. A small case study shows the value of the most influential communities using DBLP data.

To address the second research question, we investigate Diverse Influence Maximization (DIM) to efficiently find k nodes which, at the end of the propagation process, maximize both the number of activated nodes and the diversity of the activated nodes. In this work, an evaluation metric is proposed to balance the two objectives. To address the computational challenges, we develop two efficient algorithms and one advanced PSP-Tree index. The effectiveness and efficiency of our DIM solution have been verified by extensive experimental studies on five real-world social network data sets.

To address the last research question, we study the community-targeted influence minimization problem. Unlike previous influence minimization work, this study considers influence minimization in relation to a particular group of social network users, called targeted influence minimization. Thus, the objective is to protect a set of users, called target nodes, from malicious information originating from another set of users, called active nodes. This study also addresses two fundamental, but largely ignored, issues in different influence minimization problems: (i) the impact of a budget on the solution; (ii) robust sampling. To this end, two scenarios are investigated, namely unconstrained and constrained budget. Given an unconstrained budget, we provide an optimal solution. Given a constrained budget, we show the problem is NP-hard and develop a greedy algorithm with a (1 − 1/e)-approximation. More importantly, in order to solve the influence minimization problem in large, real-world social networks, we propose a robust sampling-based solution with a desirable theoretical bound. Extensive experiments using real social network datasets offer insight into the effectiveness and efficiency of the proposed solutions.

12.05.02, 12:30 – 13:30

Thursday 5 October

TIGER & BDDA Group Lunch
Hot food

We will have our group lunch this week. Please join us!

14.10.09, 12:30 – 13:30


Meetings in 2017 Q3

Thursday 28 September
Paper discussion
[TIGER] The Probability That Your Hypothesis Is Correct, Credible Intervals, and Effect Sizes for IR Evaluation

In the next TIGER meeting we’ll have a paper reading and discussion session: Tetsuya Sakai. “The Probability That Your Hypothesis Is Correct, Credible Intervals, and Effect Sizes for IR Evaluation”. In Proc. SIGIR ’17. Ruey will walk us through the main content. Following our great tradition, please read the paper and be prepared to contribute to the discussion.

13.03.09, 12:30 – 13:30

Thursday 21 September
Ameer Albahem
[TIGER] An Axiomatic Analysis of Truncated and Diversified Ranking Evaluation Metrics

Many evaluation metrics have been proposed over the years to measure search result diversification. Some of these metrics are modified and reused in different search tasks with different sets of complexities, such as truncating a ranked list of documents, where a search system returns lists of different lengths to users. Given such complexities, it is hard to study the behavior of these metrics. In this talk, we try to study these metrics using an axiomatic approach. That is, we define fundamental properties (axioms) that are desirable in an evaluation metric, then we study which metrics satisfy these properties. An example of a basic and desirable property is that a ranking with only one relevant document should never score lower than a ranking with only one non-relevant document. Following this approach, we defined a set of axioms that focus on the diversity and truncation aspects of a ranking. Our initial analysis shows that the Average Cube Test does not satisfy some axioms and should be used with caution.

12.05.02, 12:30 – 13:30

Thursday 14 September
Amir Hossein Rouhi
[BDDA] Vector-based Models for Educational Institution Shape Analysis

Over the past 25 years, performance measurement has gained salience in higher education, and with the explosion of structured data and the impact of business analytics and intelligence systems, there are new angles from which large volumes of data can be analyzed. Using traditional analytical approaches, pairs of reciprocal cohorts are considered as two separate discrete entities; therefore, the basis of analysis is individual pairs of values, using statistical measures such as average, sum, mean or median, of the total population. Missing from traditional approaches is a holistic performance measure in which the shape of the comparable cohorts is compared to the overall cohort population (vector-based analysis). The purpose of this research is to examine shape analysis, using a Cosine similarity measure to distil new perspectives on performance measures in higher education. Cosine similarity measures the angle between two vectors, regardless of their magnitude. Therefore, more similar behavior of the two entities being compared can be interpreted as a more similar orientation, i.e. load pattern distribution, between the two vectors. The efficacy of the proposed method is evaluated on a college of RMIT University from 2010 to 2016. The research also considers two other distance measures: Euclidean and Manhattan distances. The experimental results provide new insights into analyzing patterns of student load distribution and provide additional angles by comparing orientation instead of magnitude / volume. These insights help University executives gain assurance in the decision-making process.
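As a rough illustration of the contrast between orientation and magnitude described in this abstract, the following minimal Python sketch computes the three measures on made-up load vectors. The function names and the numbers are assumptions for illustration only and are not taken from the talk.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; sensitive to magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences; sensitive to magnitude."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical student-load vectors (one value per program), invented here.
cohort = [120.0, 80.0, 40.0]
college = [1200.0, 800.0, 400.0]  # same shape as cohort, ten times the volume
other = [40.0, 80.0, 120.0]       # same total volume as cohort, different shape

# Same orientation gives a cosine similarity of about 1 despite the large
# Euclidean distance between the two vectors.
print(cosine_similarity(cohort, college), euclidean_distance(cohort, college))

# Same total volume but a different shape gives a lower cosine similarity.
print(cosine_similarity(cohort, other), manhattan_distance(cohort, other))
```

In this toy data the cohort and the college differ only in scale, so a magnitude-based distance reports them as far apart while the cosine measure reports the same load pattern.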

12.05.02, 12:30 – 13:30

Friday 8 September
Toshinori (Kwansei Gakuin University, Japan)
[TIGER] Proposal of Recommendation System based on Dissatisfaction Data and Review Data for E-Commerce

E-commerce product recommendation systems are now widespread. They draw not only on user purchase history but also on ratings and review comments entered after purchase in order to recommend higher-rated related products. Such a system will recommend related products, such as accessories, but it will not recommend substitutes that solve the problems of the original items. Therefore, the user must choose some options (screen quality and size, for example) on the existing service, which makes it difficult to cater to the user’s needs. In our research, we propose a novel system that recommends items solving the problems of other items chosen by the user, using two kinds of data: data from Amazon and data from Fumankaitori Center, which collects complaint data.

The system first generates vectors for the items that the user checks. It then generates vectors for item reviews by subtracting lower-rated reviews from higher-rated reviews. Next, it calculates the similarity between these two kinds of vectors and finds the reviews that solve the problem raised in complaints about the original item. In this way, the system provides a substitute that solves the problem of the item the user checked. In our research, we describe the recommendation methods based on complaint data and review data, and verify them by qualitative evaluation.
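The abstract does not say how the vectors are built, so the following Python sketch only illustrates the general idea under strong assumptions: bag-of-words counts stand in for the vectors, cosine similarity stands in for the similarity measure, and all texts, item names and function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Word-count vector for a text (an assumed, simplistic representation)."""
    return Counter(text.lower().split())


def subtract(high, low):
    """Higher-rated review vector minus lower-rated review vector.

    Counter.subtract keeps negative counts, unlike the `-` operator.
    """
    diff = Counter(high)
    diff.subtract(low)
    return diff


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical complaint about the item the user checked.
complaint = bag_of_words("battery drains fast and screen is dim")

# Hypothetical candidates: (higher-rated review, lower-rated review).
candidates = {
    "phone_a": ("battery lasts long and screen is bright", "camera is blurry"),
    "phone_b": ("camera is sharp", "battery drains fast"),
}

scores = {
    name: cosine(complaint, subtract(bag_of_words(high), bag_of_words(low)))
    for name, (high, low) in candidates.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

With these toy texts, the candidate praised for the aspects named in the complaint scores highest, while the candidate criticized for them scores negatively.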

12.12.02, 12:30 – 13:30

Thursday 31 August
Abu Shamim Mohammad Arif (Khulna University, Bangladesh)
[TIGER] Collaborative Information Search in Tourism: A Study of User Search Behavior and Interface

Collaboration is necessary when a task is too complex and difficult to perform alone. Many search situations demand that people work together to achieve a common goal. Tourism information searching is no exception. Web searching has increasingly become a prevalent channel for tourists to search for information. A group-based travel information search activity includes preparing a travel plan, dividing the task among the group members, developing different strategies, seeking various information, exchanging views and ideas, and decision-making. In addition to these features, communication among collaborators is also important. All these actions demand collaboration among the group members. However, despite its potential appeal and necessity, collaborative tourism information searching on the Web is under-explored. This study investigated tourists’ collaborative search behavior including collaborative search stages, search strategies, the nature of information sharing, and problems encountered during collaborative search. It also attempts to develop a collaborative tourism information search tool (ColTIS). This research evaluated the performance and effectiveness of ColTIS by comparing it with a simulated collaborative search tool – Google Talk-embedded Tripadvisor.com. The results demonstrated that ColTIS performs better than Google Talk-embedded Tripadvisor.com in terms of usability, result satisfaction and result correctness. The findings of this research may further help the development or improvement of tourism CIS tools with a more comprehensive understanding of tourism CIS processes.

Abu Shamim Mohammad Arif (Shamim) is a faculty member of the Computer Science and Engineering Discipline (Department) in the School of Science, Engineering and Technology, Khulna University, Khulna, Bangladesh. He holds a PhD in Computer and Information Science from the University of South Australia, Australia. His research interests include collaborative information search, collaborative query reformulation, information search behavior, and Web interaction. Currently he is a visiting research fellow at the Information Technology and Mathematical Sciences (ITMS) School, University of South Australia (UniSA), Australia.

09.03.11, 12:30 – 13:30

Thursday 24 August
Mahdi Jalili
[TIGER] New results on Recommender Systems: Accuracy-novelty dilemma, time-aware and trust-based recommendations

Recommender Systems have many applications in both academia and industry. They are designed based on the availability of user–item interaction data (e.g. the rating history of users on items) and contextual data. Traditionally, recommender systems have been designed to have high accuracy on the data. However, users often expect to be recommended items that not only match their taste but are also novel to them. Accuracy and novelty do not often go hand in hand; novelty decreases as accuracy increases and vice versa. In this talk, I will present some new results in designing efficient recommendation systems with reasonable accuracy and novelty levels. I will also present our new time-aware and trust-based recommendation algorithms.

14.06.19, 12:30 – 13:30

Wednesday 16 August
Paul Bennett (Microsoft Research AI)
[TIGER] From Contextual Search to Contextual Intelligence

User and behavioral modeling plays a critical role in a variety of online services such as web search, advertising, e-commerce, and news recommendation. For example, our ability to accurately interpret the intent of a web search can be informed by knowledge of the web pages a searcher was viewing when initiating the search or recent actions of the searcher such as queries issued, results clicked, and pages viewed. In this talk, I will describe a framework for personalized search which improves the quality of search results by enabling a representation of a broad variety of context including the searcher’s long-term interests, recent activity, current focus, and other user characteristics. Then, I will describe how we can step beyond the interaction logs of web queries and clicks to model user intent more generally by incorporating signals from email and reminder logs. Finally, I will demonstrate how such models can provide the type of contextual intelligence needed in next-generation virtual assistants to enable these assistants to both know when to take action proactively and when to defer action to a more appropriate time.

Paul Bennett is a Senior Researcher in the Information & Data Sciences group at Microsoft Research AI where he focuses on the development, improvement, and analysis of machine learning and data mining methods as components of real-world, large-scale adaptive systems or information retrieval systems. His current focus is on contextually intelligent assistants. His research has advanced techniques for ensemble methods and the combination of information sources, calibration, consensus methods for noisy supervision labels, active learning and evaluation, supervised classification (with an emphasis on hierarchical classification) and ranking with applications to information retrieval, crowdsourcing, behavioral modeling and analysis, and personalization. His work has been recognized with awards at SIGIR, CHI, and ACM UMAP. He completed his dissertation on combining text classifiers using reliability indicators in 2006 at Carnegie Mellon where he was advised by Profs. Jaime Carbonell and John Lafferty.

80.04.06, 18.00 – 19:00
Thursday 3 August
Evi Yulianti
[TIGER] Re-ranking Documents Using Answer-biased Summaries

Ranking documents in response to a query is a fundamental problem in information retrieval. Text quality evidence derived from document summaries can potentially be a useful input to the ranking process. We propose to combine features extracted from answer-biased summaries (document summaries likely to bear answers to a query) into ranking models. In particular, CQA (Community Question Answering) data is used as an external resource to induce these summaries. Our results show that incorporating such features can give an improvement over state-of-the-art ranking techniques.

80.08.07, 12.30 – 13:30
Thursday 10 August
RMIT people are having talks about 9 accepted papers at SIGIR’17
Thursday 27 July
Kevin Ong
[TIGER] Using Information Scent to understand Mobile and Desktop web search behavior

This paper investigates if Information Foraging Theory can be used to understand differences in user behavior when searching on mobile and desktop web search systems. Two groups of thirty-six participants were recruited to carry out six identical web search tasks on desktop or on mobile. The search tasks were prepared with a different number and distribution of relevant documents on the first result page. Search behaviors on mobile and desktop were measurably different. Desktop participants viewed and clicked on more results but saved fewer as relevant, compared to mobile participants, when information scent level increased. Mobile participants achieved higher search accuracy than desktop participants for tasks with increasing numbers of relevant search results. Conversely, desktop participants were more accurate than mobile participants for tasks with an equal number of relevant results that were more distributed across the results page. Overall, both an increased number and better positioning of relevant search results improved the ability of participants to locate relevant results on both desktop and mobile. Participants spent more time and issued more queries on desktop, but abandoned less and saved more results for initial queries on mobile.

80.08.07, 12.30 – 13:30
Thursday 20 July
TIGER & BDDA Group Lunch
Hot food

This week, we will have our group lunch (Hot Food). Please join us!

14.10.09, 12.30 – 13:30
Thursday 13 July
Paolo Rosso (Universitat Politècnica de València in Spain)
[TIGER] Discourse analysis in author profiling

Author profiling is the study of how language is shared by people, a problem of growing importance in applications dealing with security, in order to understand who could be behind an anonymous threat message, and marketing, where companies may be interested in knowing the demographics of people who liked or disliked their products in online reviews. In this talk we will give an overview of the PAN (http://pan.webis.de/) shared tasks on author profiling that have been organised at the CLEF and FIRE evaluation forums since 2013, mainly on age and gender identification in social media (although personality recognition in Twitter and in source code was also addressed). We will also describe how we approached age and gender identification by modelling the way people use language to express themselves on the basis of a discourse-labelled graph.

Paolo Rosso works on plagiarism detection, short text analysis, irony detection in social media, author profiling in social media, and question answering.

Some links to give you an idea of what Paolo works on.

12.07.02, 12.30 – 13:30
Thursday 06 July
Ziying Yang (University of Melbourne)
[TIGER] Relevance Judgments: Preferences, Scores and Ties

Conventionally, relevance judgments were assessed using ordinal relevance scales such as binary and Sormunen categories. Such judgments record how much overlap there is between the document and the topic. However, they have been argued to be unreliable and not objective because: (1) documents are usually assessed by limited numbers of experts, with different viewpoints of relevance because of individual factors such as gender, age and background; (2) the distinctions between relevance levels expected by users of disparate types may be diverse; (3) assessors’ examining criteria drift in varying degrees as more documents are judged; (4) many judgment ties are generated using ordinal scales. In order to have a better understanding of users’ perceptions of relevance and collect data with high fidelity, we propose to use the Pairwise Preference technique to collect relevance judgments from a crowdsourcing platform. We will use well-designed forced-choice testing and quality control processes to collect preference judgments, which only record which document in a document pair is preferred by assessors. With the collected judgments, a computed rank list containing all judged documents for each topic will be generated with the goal of having fewer relevance ties.

The collected judgments will be compared with three relevance scales: NIST binary; Sormunen judgments; and Magnitude Estimation. The comparisons will be in terms of document relevance, system ordering and cost. We will use ANOVA to analyze relevance judgment variations with dependent variables topic, assessor, document and judgment technique, in order to determine the relative weight of the factors that affect relevance assessment. Further, we will discover how the collected judgments can be mapped to normalized numeric relevance scores for documents in the computed list. The distribution of normalized relevance scores will provide additional knowledge of user perceptions of relevance. The gain profiles found by collected judgments and Magnitude Estimation will be compared. The findings will suggest how gain functions of metrics can be refined.
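As a rough illustration of how forced-choice preference judgments can be turned into a ranked list, the sketch below orders documents by the fraction of comparisons they won. The win-fraction rule, the function name and the document ids are assumptions made for this example; the abstract does not state which aggregation method the talk uses.

```python
from collections import defaultdict


def rank_from_preferences(preferences):
    """Order documents by the fraction of pairwise comparisons they won.

    `preferences` is a list of (winner, loser) document-id pairs, one per
    forced-choice judgment. Win fraction is an illustrative aggregation
    rule only, not the method described in the talk.
    """
    wins = defaultdict(int)
    comparisons = defaultdict(int)
    for winner, loser in preferences:
        wins[winner] += 1
        comparisons[winner] += 1
        comparisons[loser] += 1
    # Sort by win fraction (descending), breaking ties by document id.
    return sorted(comparisons, key=lambda d: (-wins[d] / comparisons[d], d))


# Hypothetical judgments: each tuple says "first document preferred over second".
judgments = [("d1", "d2"), ("d1", "d3"), ("d2", "d3"), ("d1", "d2")]
ranking = rank_from_preferences(judgments)
print(ranking)
```

Because every judgment forces a choice, ties in the final list arise only when two documents have the same win fraction, which is less common than ties on a coarse ordinal scale.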

14.10.09, 12.30 – 13:30

Meetings in 2017 Q2

Thursday 29 June
Xiaolu Lu
[TIGER] Efficient and Effective Retrieval using Proximity Features

Higher-order ranking models have been widely used in information retrieval due to their effectiveness. Among all of the features used in those models, proximity is important and has been intensively studied in the literature. In this research, we address questions emerging from the study of using proximity features in order to improve retrieval quality, which fall into three categories: (i) In terms of effectiveness, we empirically show that effectiveness can be improved when using proximity features, compared to a bag-of-words model, and that this may lead to less degradation in effectiveness than using phrase-based features; (ii) From an efficiency perspective, the computational cost of integrating this feature into existing models can be significant. We propose and empirically show that a batch plane-sweep algorithm can be used to reduce the cost within one document, and a BM25-based MRF model can avoid the collection-level statistics in order to reduce the retrieval cost, but with little loss of effectiveness; (iii) In order to be able to fairly compare systems on large collections, we examined several commonly used metrics in terms of the evaluation depth and pooling depth, and conclude that using a non-recall based metric in addition to recall based metrics is necessary, especially on large collections when a limited pooling depth is applied.

80.06.05, 13.00 – 14:00
Thursday 22 June
Tadele Tedla Damessie
[TIGER] Gauging the Quality of Relevance Assessments using Inter-Rater Agreement

In recent years, gathering relevance judgments through non-topic originators has become an increasingly important problem in Information Retrieval. Relevance judgments can be used to measure the effectiveness of a system, and are often needed to build supervised learning models in learning-to-rank retrieval systems. The two most popular approaches to gathering bronze level judgments, where the judge is not the originator of the information need for which relevance is being assessed and is not a topic expert, are a controlled user study and crowdsourcing. However, judging comes at a cost (in time, and usually money) and the quality of the judgments can vary widely. In this work, we directly compare the reliability of judgments using three different types of bronze assessor groups. Our first group is a controlled Lab group; the second and third are two different crowdsourcing groups, CF-Document where assessors were free to judge any number of documents for a topic, and CF-Topic where judges were required to judge all of the documents from a single topic, in a manner similar to the Lab group. Our study shows that Lab assessors exhibit a higher level of agreement with a set of ground truth judgments than CF-Topic and CF-Document assessors. Inter-rater agreement rates show analogous trends. These findings suggest that in the absence of ground truth data, agreement between assessors can be used to reliably gauge the quality of relevance judgments gathered from secondary assessors, and that controlled user studies are more likely to produce reliable judgments despite being more costly.
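To make the notion of inter-rater agreement concrete, here is a minimal sketch of Cohen's kappa for two assessors who judged the same documents. The abstract does not name the agreement statistic used in the study, so the choice of kappa is an assumption, as are the function name and the example labels.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two assessors on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both assessors must judge the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both assessors labelled independently at their own rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical binary relevance labels from two assessors over eight documents.
assessor_1 = [1, 1, 0, 0, 1, 0, 1, 0]
assessor_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(assessor_1, assessor_2)
print(round(kappa, 3))
```

Here the assessors agree on six of eight documents, while chance alone would give agreement on half of them, so kappa is 0.5.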

Xiaolu Lu
[TIGER] Can Deep Effectiveness Metrics Be Evaluated Using Shallow Judgment Pools?

Increasing test collection sizes and limited judgment budgets create measurement challenges for IR batch evaluations, challenges that are greater when using deep effectiveness metrics than when using shallow metrics, because of the increased likelihood that unjudged documents will be encountered. Here we study the problem of metric score adjustment, with the goal of accurately estimating system performance when using deep metrics and limited judgment sets, assuming that dynamic score adjustment is required per topic due to the variability in the number of relevant documents. We seek to induce system orderings that are as close as is possible to the orderings that would arise if full judgments were available.

Starting with depth-based pooling, and no prior knowledge of sampling probabilities, the first phase of our two-stage process computes a background gain for each document based on rank-level statistics. The second stage then adds results from previous techniques, and accounts for the distributional variance of relevant documents. We also exploit the frequency statistics of pooled relevant documents in order to determine a threshold for dynamically determining the set of topics to be adjusted. Taken together, our results show that: (i) better score estimates can be achieved when compared to previous work; (ii) by setting a global threshold, we are able to adapt our methods to different collections; and (iii) the proposed estimation methods reliably approximate the system orderings achieved when many more relevance judgments are available. We also consider pools generated by a two-strata sampling approach.

12.10.02, 12.30 – 13:30
Thursday 15 June
Luke Gallagher
[TIGER] Efficient Cost-Aware Cascade Ranking in Multi-Stage Retrieval

Complex machine learning models are now an integral part of modern, large-scale retrieval systems. However, collection size growth continues to outpace advances in efficiency improvements in the learning models which achieve the highest effectiveness.

In this paper, we re-examine the importance of tightly integrating feature costs into multi-stage learning-to-rank (LTR) IR systems. We present a novel approach to optimizing cascaded ranking models which can directly leverage a variety of different state-of-the-art LTR rankers such as LambdaMART and Gradient Boosted Decision Trees. Using our cascade model, we conclusively show that feature costs and the number of documents being re-ranked in each stage of the cascade can be balanced to maximize both efficiency and effectiveness. Finally, we also demonstrate that our cascade model can easily be deployed on commonly used collections to achieve state-of-the-art effectiveness results while only using a subset of the features required by the full model.

Evi Yulianti
[TIGER] Improving Document Ranking Using Answer-biased Summaries

Ranking documents in response to a query is a fundamental problem in information retrieval. Text quality evidence derived from document summaries can potentially be a useful input to the ranking process. We propose to combine features extracted from answer-biased summaries (document summaries likely to bear answers to a query) into ranking models. In particular, CQA (Community Question Answering) data is used as an external resource to induce these summaries. Our results show that incorporating such features can give an improvement over state-of-the-art ranking techniques.

12.10.02, 12.30 – 13:30
Thursday 8 June
Mark Sanderson
[TIGER] Examining Additivity and Weak Baselines

I will talk about a study of which baseline to use when testing a new retrieval technique. In contrast to past work, we show that measuring a statistically significant improvement over a weak baseline is not a good predictor of whether a similar improvement will be measured on a strong baseline. Indeed, sometimes strong baselines are made worse when a new technique is applied. We investigate whether conducting comparisons against a range of weaker baselines can increase confidence that an observed effect will also show improvements on a stronger baseline. Our results indicate that this is not the case – at best, testing against a range of baselines means that an experimenter can be more confident that the new technique is unlikely to significantly harm a strong baseline. Examining recent past work, we present evidence that the IR community continues to test against weak baselines. This is unfortunate, as in the light of our experiments we conclude that the only way to be confident that a new technique is a contribution is to compare it against nothing less than the state of the art.

12.10.02, 12.30 – 13:30
Thursday 1 June
Rosni Lumbantoruan
[BDDA] Local Item-Item Models for Top-N Recommendation

Item-based approaches based on SLIM (Sparse LInear Methods) have demonstrated very good performance for top-N recommendation; however, they only estimate a single model for all the users. This work is based on the intuition that not all users behave in the same way; instead, there exist subsets of like-minded users. By using different item-item models for these user subsets, we can capture differences in their preferences and this can lead to improved performance for top-N recommendations. In this work, we extend SLIM by combining global and local SLIM models. We present a method that computes the prediction scores as a user-specific combination of the predictions derived by global and local item-item models. We present an approach in which the global model, the local models, their user-specific combination, and the assignment of users to the local models are jointly optimized to improve the top-N recommendation performance. Our experiments show that the proposed method improves upon the standard SLIM model and outperforms competing top-N recommendation approaches.

12.10.02, 12.30 – 13:30
Thursday 25 May
TIGER & BDDA Group Lunch
Hot food

This week, we will have our group lunch (Hot Food). Please join us!

80.09.09, 12.30 – 14:00
Thursday 18 May
[BDDA] Effective Indexing for Approximate Constrained Shortest Path Queries on Large Road Networks

In a constrained shortest path (CSP) query, each edge in the road network is associated with both a length and a cost. Given an origin s, a destination t, and a cost constraint θ, the goal is to find the shortest path from s to t whose total cost does not exceed θ. Because exact CSP is NP-hard, previous work mostly focuses on approximate solutions. Even so, existing methods are still prohibitively expensive for large road networks. Two main reasons are (i) that they fail to utilize the special properties of road networks and (ii) that most of them process queries without indices; the few existing indices consume large amounts of memory and yet have limited effectiveness in reducing query costs. Motivated by this, we propose COLA, the first practical solution for approximate CSP processing on large road networks. COLA exploits the facts that a road network can be effectively partitioned, and that there exists a relatively small set of landmark vertices that commonly appear in CSP results. Accordingly, COLA indexes the vertices lying on partition boundaries, and applies an on-the-fly algorithm called α-Dijk for path computation within a partition, which effectively prunes paths based on landmarks. Extensive experiments demonstrate that on continent-sized road networks, COLA answers an approximate CSP query in sub-second time, whereas existing methods take hours. Interestingly, even without an index, the α-Dijk algorithm in COLA still outperforms previous solutions by more than an order of magnitude.
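To make the query definition concrete, the sketch below answers a CSP query exactly on a toy graph using a simple label-setting search with dominance pruning. This is not COLA or α-Dijk, neither of which is specified in enough detail here to reproduce; the graph encoding, function name and example network are assumptions made for illustration, and edge lengths and costs are assumed to be non-negative.

```python
import heapq


def constrained_shortest_path(graph, s, t, theta):
    """Return (length, cost, path) of the shortest s-t path with cost <= theta.

    `graph` maps a vertex to a list of (neighbour, length, cost) edges.
    Returns None when no path satisfies the cost constraint.
    """
    heap = [(0, 0, s, [s])]  # labels ordered by length, then cost
    best_cost = {}           # smallest cost among labels already settled per vertex
    while heap:
        length, cost, u, path = heapq.heappop(heap)
        if u == t:
            return length, cost, path
        # An earlier label at u was no longer and no costlier: this one is dominated.
        if cost >= best_cost.get(u, float("inf")):
            continue
        best_cost[u] = cost
        for v, edge_len, edge_cost in graph.get(u, []):
            new_cost = cost + edge_cost
            if new_cost <= theta:  # discard extensions that break the constraint
                heapq.heappush(heap, (length + edge_len, new_cost, v, path + [v]))
    return None


# Toy road network: a short expensive route, a longer cheap one, and a free detour.
roads = {
    "s": [("a", 1, 5), ("b", 2, 1), ("t", 10, 0)],
    "a": [("t", 1, 5)],
    "b": [("t", 2, 1)],
}
print(constrained_shortest_path(roads, "s", "t", theta=10))
print(constrained_shortest_path(roads, "s", "t", theta=5))
print(constrained_shortest_path(roads, "s", "t", theta=1))
```

Tightening θ forces the answer onto longer but cheaper routes. The number of labels an exact search like this must keep can grow quickly on large networks, which is one reason approximate, index-based methods are attractive.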

12.11.19, 12.30 – 13:30
Thursday 11 May
Liangjun Song
[BDDA] Big Data Integration and Entity Matchings

The Big Data era is upon us: data is being generated, collected and analyzed at an unprecedented scale, and data driven decision making is sweeping through all aspects of society. Since the value of data explodes when it can be linked and fused with other data, addressing the big data integration (BDI) challenge is critical to realizing the promise of Big Data. BDI differs from traditional data integration in many dimensions: (i) the number of data sources, even for a single domain, has grown to be in the tens of thousands, (ii) many of the data sources are very dynamic, as a huge amount of newly collected data are continuously made available, (iii) the data sources are extremely heterogeneous in their structure, with considerable variety even for substantially similar entities, and (iv) the data sources are of widely differing qualities, with significant differences in the coverage, accuracy and timeliness of data provided. This seminar explores the progress that has been made by the data integration community on the topics of schema mapping, record linkage and data fusion in addressing these novel challenges faced by big data integration, and identifies a range of open problems for the community.

12.11.19, 12.30 – 13:30
Thursday 4 May
Paper discussion
[TIGER] Neural Ranking Models with Weak Supervision

Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). To this aim, we use the output of an unsupervised ranking model, such as BM25, as a weak supervision signal. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation). We train our networks using tens of millions of training instances and evaluate them on two standard collections: a homogeneous news collection (Robust) and a heterogeneous large-scale web collection (ClueWeb). Our experiments indicate that employing proper objective functions and letting the networks learn the input representation based on weakly supervised data leads to impressive performance, with over 13% and 35% MAP improvements over the BM25 model on the Robust and the ClueWeb collections. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
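The idea of using BM25 output as a weak supervision signal can be sketched as follows: score each document for a query with BM25, then emit pairwise labels saying which document of each pair BM25 prefers. The whitespace tokenizer, the parameter values k1=1.2 and b=0.75, the IDF variant and all names are assumptions for this example; the abstract does not describe the paper's actual label-generation pipeline.

```python
import math
from collections import Counter
from itertools import combinations


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` for `query` with a basic BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Non-negative IDF variant (assumed here for simplicity).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def weak_pairwise_labels(query, docs):
    """Return (i, j, label) triples; label is 1 if BM25 prefers doc i over doc j."""
    scores = bm25_scores(query, docs)
    return [(i, j, 1 if scores[i] > scores[j] else 0)
            for i, j in combinations(range(len(docs)), 2)
            if scores[i] != scores[j]]  # skip pairs BM25 cannot separate


# Hypothetical three-document collection.
collection = [
    "neural ranking models for retrieval",
    "cooking pasta at home",
    "ranking documents with neural networks and neural models",
]
scores = bm25_scores("neural ranking", collection)
labels = weak_pairwise_labels("neural ranking", collection)
print(scores)
print(labels)
```

No human judgment is involved at any point, which is what allows this kind of labelling to scale to millions of training instances.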


12.11.19, 12.30 – 13:30
Thursday 27 April
TIGER & BDDA Group Lunch

We will have our group lunch. Please join us!

56.07.91, 12.30 – 14:00
Thursday 20 April
Ruey-Cheng Chen
[TIGER] Non-Factoid Question Answering: Then and Now

Non-factoid question answering is a research area focused on retrieving passages from webpages that are likely to address users’ natural language queries. This technique grants search engines the capability to go beyond knowledge graphs and derive answers to complex (and sometimes opinionated) questions from a massive amount of unstructured text. In this talk, I will cover some background and advances in this area, and discuss how the core techniques developed for non-factoid questions can be used to improve recent factoid question answering models.

12.11.19, 12.30 – 13:30
Thursday 13 April
Paper discussion
[TIGER] Deep-Learning Reading Session

Following the deep learning session two weeks ago, we are going to have another deep-learning reading session. Specifically, we are going to discuss the following paper:


Please read the paper before attending the meeting. Thanks.

12.07.02, 12.30 – 13:30
Thursday 6 April
Mingzhao Li
[BDDA] Hashedcubes: Simple, low memory, real-time visual exploration of big data.

We propose Hashedcubes, a data structure that enables real-time visual exploration of large datasets that improves the state of the art by virtue of its low memory requirements, low query latencies, and implementation simplicity. In some instances, Hashedcubes notably requires two orders of magnitude less space than recent data cube visualization proposals. In this paper, we describe the algorithms to build and query Hashedcubes, and how it can drive well-known interactive visualizations such as binned scatterplots, linked histograms and heatmaps. We report memory usage, build time and query latencies for a variety of synthetic and real-world datasets, and find that although Hashedcubes sometimes offers slightly slower querying times than the state of the art, the typical query is answered fast enough to easily sustain an interaction. In datasets with hundreds of millions of elements, only about 2% of the queries take longer than 40ms. Finally, we discuss the limitations of the data structure, potential space-time tradeoffs, and future research directions.

12.11.19, 12.30 – 13:30

Meetings in 2017 Q1

Thursday 30 March
TIGER & BDDA Group Lunch

Kindly supported by Prof. Mark Sanderson, we will have our TIGER/BDDA group lunch. It is free for you, but as part of our group meeting, you are expected to spend 2 minutes talking about your own recent research. So, please come prepared, and enjoy the lunch.

80.09.09, 12.30 – 13:30
Thursday 23 March
Zhuyun Dai (CMU)
[TIGER] Query-biased Partitioning for Selective Search

Selective search is a type of efficient distributed retrieval that partitions a corpus into topical index shards, and searches just a few shards per query. It forms topical shards by clustering the corpus based on the documents’ contents. However, the topic distribution produced by clustering may not match the distribution of topics in search traffic, which reduces the effectiveness of selective search. In this talk, I will discuss using query logs to guide the partitioning process for selective search. The work explores 1) a clustering initialization algorithm which uses topics from query logs to form cluster seeds, and 2) a clustering similarity metric which favors terms that are important in query logs. Results for both methods will be provided and discussed in detail.

Zhuyun Dai is a PhD student in the Language Technologies Institute under the School of Computer Science at Carnegie Mellon University. Zhuyun works with Prof. Jamie Callan on Information Retrieval research. Her research interests include deep learning for text, distributed information retrieval and federated search.

12.11.19, 12.30 – 13:30
Thursday 16 March
Yubin Kim (CMU)
[TIGER] Challenges of Selective Search

Selective search is a modern distributed search architecture designed to reduce the computational cost of large-scale search. This architecture comes with a unique set of challenges, some of which are addressed in this talk; we discuss load balancing for topical shards, answer questions on combining selective search with dynamic pruning, and investigate the effects of random decisions in the system on the accuracy of the final search results.

Yubin Kim is a PhD Candidate at Carnegie Mellon University advised by Prof. Jamie Callan in the Language Technologies Institute. Her thesis work is on distributed and federated search, and she has previously worked on Twitter search and crowdsourcing for information retrieval. She has interned at many companies including Microsoft Research, Google and A9.com.

09.03.11, 12.30 – 13:30
Thursday 9 March
Zhiyong Wang (University of Sydney)
[TIGER] Beyond video search: video summarization

It has become more and more demanding for users to quickly comprehend video content, while big multimedia data is growing exponentially. Video summarization, also known as video abstracting, extracts the essential information of a video to produce a compact and informative version. In this talk, he will present recent studies of his team in this field. While most existing methods rely on global visual features to characterize each video frame, his team formulates, for the first time, the video summarization task as a keypoint selection problem from a local feature point of view. Based on this new perspective, they develop a new keypoint coverage approach and a novel Bag-of-Importance (BoI) model for static video summarization. He will demonstrate their state-of-the-art results, as well as discuss potential directions of this topic.

Zhiyong Wang is an Associate Professor and Associate Director of Multimedia Laboratory at the School of Information Technologies, University of Sydney. He received his BEng and MEng degrees in Electronic Engineering from South China University of Technology, Guangzhou, China, and his PhD degree from Hong Kong Polytechnic University, Hong Kong. During his PhD research on multimedia information retrieval and management, he pioneered image based plant identification and structural representation of image content for image retrieval and classification, in particular for plants of Traditional Chinese Medicine, which can be generally extended to a wide range of applications in agriculture and other natural resource management. He also made landmark contributions on video summarization. His research interests generally include multimedia information processing, retrieval and management, Internet-based multimedia data mining, and pattern recognition. His research outcomes have contributed to fundamental multimedia computing theories and applications in many domains such as geoscience, security, health care, and social science.

09.03.11, 12.30 – 13:30
Thursday 2 March
TIGER & BDDA Group lunch
Donut meeting

This week’s TIGER/BDDA session will be a donut meeting, where we will discuss

1) how to spend some extra money this year, and
2) the plans for the research centre.

14.10.09, 12.30 – 13:30
Thursday 16 February
Farhana Choudhury
[BDDA] A completion seminar for Farhana

As a result of the increasing popularity of GPS enabled mobile devices, the volume of content associated with both a geographic location and a text description is growing rapidly on the web. The availability of this data enables us to answer many real-life spatial and textual queries for different applications. This research addresses the increasing demand for answering real-life queries that are useful, especially in business and marketing applications. In addition, the need to improve performance while processing a large number of queries together arises in many problems. Specifically, the aims of this research are to: (i) answer a novel query useful in many practical applications, called the MaxBRSTkNN query, which finds the location and the text description of a specific product or service such that the product is one of the top-k spatial-textual relevant objects of the maximum number of users; (ii) efficiently process a set of queries simultaneously as a batch, which is crucial for the performance of various applications, including the MaxBRSTkNN query; and (iii) address the problem of top-m rank aggregation for streaming queries, where, given a set of objects O and a stream of spatial queries that require a partial ranking of O, the problem is to continuously report the m objects with the highest aggregate rank. As the result objects are highly ranked by a large number of users, these results can be suggested to new users as “trending items”.

Tadele Tedla Damessie
[TIGER] Topic Difficulty Effects in Information Retrieval Test Collection Construction

Test collections are the most widely used methodology for the evaluation of information retrieval system effectiveness, and a key component of the approach is a set of relevance judgments, indicating which documents should be considered to be appropriate answers in response to a topic. There exist a number of standard test collections in information retrieval. Text REtrieval Conference (TREC) and NII Test Collection for Information Retrieval (NTCIR) are two examples, and these test collections have documents, topics and relevance judgments. The relevance judgments are usually developed iteratively by human assessors using a method commonly known as pooling. Pooling is the process of collecting the top-K documents retrieved for a topic by different retrieval models, and determining the relevance of these documents using human assessors.

Given the popular use of test collections in the evaluation of retrieval systems, their reliability is of paramount importance for evaluating conclusively the effectiveness of one retrieval model over another. One way of ensuring this is to investigate factors which could influence assessors’ relevance judgments, which are the main task in the construction of a test collection; in this study, we will investigate the effects topic difficulty has on assessors’ relevance judgments.
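The pooling step described above can be sketched in a few lines: take the top-K documents from each system's ranking for a topic and send the union to assessors. The run names, document ids and function name below are made up for illustration.

```python
def build_pool(runs, depth):
    """Union of the top-`depth` documents from each system's ranking for one topic.

    `runs` maps a system name to its ranked list of document ids.
    """
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


# Hypothetical rankings from three retrieval systems for a single topic.
runs = {
    "bm25": ["d3", "d1", "d7", "d2"],
    "lm": ["d1", "d4", "d3", "d9"],
    "neural": ["d5", "d1", "d2", "d3"],
}
pool = build_pool(runs, depth=2)
print(sorted(pool))
```

Only documents in the pool are shown to human assessors; anything retrieved below the pooling depth by every system stays unjudged, which is why the choice of depth matters for the reliability of the resulting collection.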

14.10.09, 12.00 – 13:30
Thursday 2 February
Joel Mackenzie
[TIGER] A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation

We present an empirical comparison between document-at-a-time (DaaT) and score-at-a-time (SaaT) document ranking strategies within a common framework. Although both strategies have been extensively explored, the literature lacks a fair, direct comparison: such a study has been difficult due to vastly different query evaluation mechanics and index organizations. Our work controls for score quantization, document processing, compression, implementation language, implementation effort, and a number of details, arriving at an empirical evaluation that fairly characterizes the performance of three specific techniques: WAND (DaaT), BMW (DaaT), and JASS (SaaT). Experiments reveal a number of interesting findings. The performance gap between WAND and BMW is not as clear as the literature suggests, and both methods are susceptible to tail queries that may take orders of magnitude longer than the median query to execute. Surprisingly, approximate query evaluation in WAND and BMW does not significantly reduce the risk of these tail queries. Overall, JASS is slightly slower than either WAND or BMW, but exhibits much lower variance in query latencies and is much less susceptible to tail query effects. Furthermore, JASS query latency is not particularly sensitive to the retrieval depth, making it an appealing solution for performance-sensitive applications where bounds on query latencies are desirable.

12.10.02, 12.30 – 13:30
Thursday 19 January
Zhen He (La Trobe University)
[TIGER] A tutorial on deep learning for natural language processing

There has been a lot of rapid advancement in deep learning algorithms for many different application domains including computer vision, natural language processing (NLP), speech recognition, language translation, etc. So far the biggest breakthroughs have been in computer vision and speech recognition. However, many experts say NLP will be the domain where the next big breakthrough will occur. In this tutorial I will start by presenting how deep learning can be used for simpler tasks such as language modeling and classification. Then I will move to more complex tasks such as language translation and question answering. I will present the following three key enabling deep learning technologies used to accomplish the previously mentioned tasks: neural network based word embeddings; recurrent neural networks; and memory networks. Finally, I would like to explore with the audience how some of these technologies can be used for information retrieval applications.

Zhen He is an Associate Professor in the Department of Computer Science at La Trobe University. His main research interest is in deep learning. He leads a deep learning research group at La Trobe University. The group is focused on working on three different data types: video, image and text. For video the group is working with the Australian Institute of Sport (AIS) on action recognition in a number of different sports. For image the group is working with the Alfred hospital on cancer diagnosis from CT scans of lungs. For text the group is working with Zendesk on converting text from support tickets into powerful latent representations that can be used for a number of different tasks.

08.11.06, 12.30 – 13:30
Thursday 12 January
Dong Qin
[BDDA] Two main approaches to recommender systems

Recommender systems provide users with personalized suggestions for products or services. In this tutorial, the speaker will review two main methods: Content-based Filtering and Collaborative Filtering.

Content-based filtering (CBF) has its roots in information retrieval and information filtering research. It tries to recommend items that are similar to those that a user liked in the past. The speaker will introduce the main concepts and procedures of content-based filtering through an example.

Collaborative Filtering (CF) analyses past transactions to establish connections between users and products. The speaker will introduce two main implementations of CF (latent factor models and neighborhood models) and analyse their theoretical bases. After that, a brief discussion of the paper “Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model” (KDD 08) will help to understand CF further.

Finally, a comparison of these two main methods will be discussed, as well as some open problems in recommender systems.
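
As a rough illustration of the neighborhood flavour of CF described above, the following minimal sketch predicts a rating as a similarity-weighted average of other users' ratings. The toy data, the function names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration; they are not taken from the tutorial or from the KDD 08 paper.

```python
import math

# Toy user -> {item: rating} data; all names and values are made up.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 5, "d": 4},
    "cat": {"a": 1, "b": 5, "d": 2},
}


def cosine(u, v):
    """Cosine similarity restricted to the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of the neighbours' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    # None signals that no neighbour has rated the item.
    return num / den if den else None


print(predict("ann", "d"))
```

Here "ann" is more similar to "bob" than to "cat", so the prediction for item "d" is pulled towards bob's rating. A latent factor model would instead learn low-dimensional user and item vectors and predict from their inner product.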

08.11.06, 12.30 – 13:30

Meetings in 2016 Q4

Thursday 15 December
Ian Munro (University of Waterloo)
[TIGER] Optimal search trees with 2-way comparisons

This talk is about finding a polynomial time algorithm that you probably thought was known almost a half century ago, but it wasn’t. The polynomial time algorithm is still rather slow and requires a lot of space to solve, so we also look at extremely good and fast approximate solutions. More specifically … In 1971, Knuth gave an O(n^2)-time algorithm for the now classic problem of finding an optimal binary search tree. Knuth’s algorithm works only for search trees based on 3-way comparisons, but most modern programming languages and computers support only 2-way comparisons (<, = and >). Until this work, the problem of finding an optimal search tree using 2-way comparisons remained open; polynomial time algorithms were known only for restricted variants. We solve the general case, giving (i) an O(n^4)-time algorithm and (ii) a linear time algorithm that gives a tree with expected search cost within 2 comparisons of the optimal. This is joint work with Marek Chrobak, Mordecai Golin, and Neal E. Young.
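
For background, the classic 3-way-comparison problem mentioned above can be sketched with a textbook dynamic programme. This is only the well-known baseline, written here as a plain O(n^3) recurrence over successful searches; it is neither Knuth's O(n^2) refinement nor the 2-way-comparison algorithm from the talk, and the function name is a placeholder.

```python
def optimal_bst_cost(freq):
    """Minimum weighted search cost of a BST over sorted keys.

    freq[i] is the access weight of the i-th key. Each node visited costs
    one 3-way comparison; only successful searches are modelled.
    """
    n = len(freq)
    prefix = [0]
    for f in freq:
        prefix.append(prefix[-1] + f)

    # cost[i][j] is the optimal cost for keys i..j-1 (an empty range costs 0).
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Every key in the range sits one level deeper than in its subtree.
            weight = prefix[j] - prefix[i]
            cost[i][j] = weight + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )
    return cost[0][n]


print(optimal_bst_cost([34, 8, 50]))  # 142
```

Knuth's speed-up comes from observing that the optimal root index is monotone in the range endpoints, which narrows the inner `min` to a small window.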

Ian Munro is University Professor and Canada Research Chair in Algorithm Design in the D.R. Cheriton School of Computer Science at the University of Waterloo, where he has been a faculty member since completing his PhD at the University of Toronto in 1971. His research has concentrated on the efficiency of algorithms and data structures. He has authored about 150 research papers and supervised more than a dozen Ph.D.’s on the subject. Dr. Munro has held visiting positions at a number of major universities and research labs, including AT&T Bell Labs, Princeton University and the Max Planck Institute for Informatics. His consulting activities have included work with government and several major computer companies. He has served on the editorial boards of CACM, Inf & Comp, and B.I.T., and the program committees of most of the major conferences in his area. He is a former Director of Waterloo’s Institute for Computer Research. He is presently a member of the board of the Centre for Education in Mathematics and Computing. He was elected Fellow of the Royal Society of Canada in 2003 and made University Professor in 2006.

80.09.09, 12.30 – 13:30
Thursday 1 December
Damiano Spina
[TIGER] Optimal Conversational Search over a Speech-Only Communication Channel: Challenges and Opportunities

While the ever-growing use of smartphones, wearable devices and connected cars has resulted in speech interaction becoming increasingly pervasive, some critical tasks, such as delivery of information search results, are currently poorly handled over a speech-only channel. Nowadays, SERPs (Search Engine Result Pages) in screen-based search interfaces include much more information than the traditional ‘ten blue links’. How to effectively deliver rich search result information to users in a speech-only interface is an open problem. In this talk, I will introduce some of the challenges and opportunities that we have identified so far in speech-only conversational search.

Tadele Tedla Damessie
[TIGER] The Influence of Topic Difficulty, Relevance Level, and Document Ordering on Relevance Judging

Judging the relevance of documents for an information need is an activity that underpins the most widely-used approach in the evaluation of information retrieval systems. In this study we investigate the relationship between how long it takes an assessor to judge document relevance, and three key factors that may influence the judging scenario: the difficulty of the search topic for which relevance is being assessed; the degree to which the documents are relevant to the search topic; and, the order in which the documents are presented for judging. Two potential confounding influences on judgment speed are differences in individual reading ability, and the length of documents that are being assessed. We therefore propose two measures to investigate the above factors: normalized processing speed (NPS), which adjusts the number of words that were processed per minute by taking into account differences in reading speed between judges, and normalized dwell time (NDT), which adjusts the duration that a judge spent reading a document relative to document length. Note that these two measures have different relationships with overall judgment speed: a direct relationship for NPS, and an inverse relationship for NDT.

The results of a small-scale user study show a statistically significant relationship between judgment speed and topic difficulty: for easier topics, assessors process more quickly (higher NPS), and spend less time overall (lower NDT). There is also a statistically significant relationship between the level of relevance of the document being assessed and overall judgment speed, with assessors taking less time for non-relevant documents. Finally, our results suggest that the presentation order of documents can also affect overall judgment speed, with assessors spending less time (smaller NDT) when documents are presented in relevance order than docID order. However, these ordering effects are not significant when also accounting for document length variance (NPS).

80.09.07, 12.30 – 13:30
Thursday 24 November
Sheng Wang
[BDDA] Answering Top-k Exemplar Trajectory Query

We study a new type of spatial-textual trajectory search: the Exemplar Trajectory Query (ETQ), which specifies one or more places to visit, and descriptions of activities at each place. Our goal is to efficiently find the top-k trajectories by computing spatial and textual similarity at each point. The computational cost for pointwise matching is significantly higher than previous approaches. Therefore, we introduce an incremental pruning baseline and explore how to adaptively tune our approach, introducing a gap-based optimization and a novel two-level threshold algorithm to improve efficiency. Our proposed methods support order-sensitive ETQ with a minor extension. Experiments on two datasets verify the efficiency and scalability of our proposed solution.

80.09.07, 12.30 – 13:30
Thursday 1 December
Alicia Ziying Yang
[TIGER] How Precise Does Document Scoring Need To Be?

We explore the implications of tied scores arising in the document similarity scoring regimes that are used when queries are processed in a retrieval engine. Our investigation has two parts: first, we evaluate past TREC runs to determine the prevalence and impact of tied scores, to understand the alternative treatments that might be used to handle them; and second, we explore the implications of what might be thought of as “deliberate” tied scores, in order to allow for faster search. In the first part of our investigation we show that while tied scores had the potential to be disruptive to TREC evaluations, in practice their effect was relatively minor. The second part of our exploration helps understand why that was so, and shows that quite marked levels of score rounding can be tolerated, without greatly affecting the ability to compare between systems. The latter finding offers the potential for approximate scoring regimes that provide faster query processing with little or no loss of effectiveness.

Xiaolu Lu
[TIGER] Modeling Relevance as a Function of Retrieval Rank

Batched evaluations in IR experiments are commonly built using relevance judgments formed over a sampled pool of documents. However, judgment coverage tends to be incomplete relative to the metrics being used to compute effectiveness, since collection size often makes it financially impractical to judge every document. As a result, a considerable body of work has arisen exploring the question of how to fairly compare systems in the face of unjudged documents. Here we consider the same problem from another perspective, and investigate the relationship between relevance likelihood and retrieval rank, seeking to identify plausible methods for estimating document relevance and hence computing an inferred gain. A range of models are fitted against two typical TREC datasets, and evaluated both in terms of their goodness of fit relative to the full set of known relevance judgments, and also in terms of their predictive ability when shallower initial pools are presumed, and extrapolated metric scores are computed based on models developed from those shallow pools.

80.09.07, 12.30 – 13:30
Thursday 10 November
Maria Maistro
[TIGER] Preliminary Exploration of User Behaviour in Job Search

This talk will present an initial exploration of user behaviour in job search using query and click logs from the SEEK search engine. The observations suggest that the understanding of users’ search behaviour in this scenario is still in its infancy and that some of the assumptions made in general web search may not hold true. Moreover, we will introduce a user model based on Markov chains and we will present a family of new evaluation measures, called Markov Precision (MP), which exploits the presented Markovian model in order to inject user models into precision.

80.09.07, 12.30 – 13:30
Thursday 27 October
Paper reading and discussion
[TIGER] Statistical Significance, Power, and Sample Sizes: A Systematic Review of SIGIR and TOIS, 2006-2015

Next week we’ll have a paper reading and discussion session:

Tetsuya Sakai. “Statistical Significance, Power, and Sample Sizes: A Systematic Review of SIGIR and TOIS, 2006-2015”. In Proc. SIGIR’16 (PDF attached).

Your entry ticket to this TIGER session is to read the paper, and be prepared to contribute to the discussion by coming with a comment or a question about the content.

80.09.07, 12.30 – 13:30
Thursday 20 October
TIGER/BDDA Faculty Meeting

This week we will have a TIGER/BDDA faculty meeting. During the meeting, Falk will lead a discussion about the RMIT web site, and Mark will introduce a visitor, Roi Blanco from Yahoo Research, and discuss group supervision sessions, among other items. Mark will bring donuts.

12.10.02, 12.30 – 13:30
Thursday 13 October
Johanne and Shafiza
[TIGER] Preliminary Exploration of User Behaviour in Job Search

Shafiza and Johanne will be presenting a tutorial-style introduction to the crowdsourcing platform CrowdFlower. They will be presenting an overview of who crowdworkers are, what style of work you can submit to CrowdFlower, and some important pointers for your job and analysis design.

This is an interactive TIGER talk and everyone is encouraged to bring their experiences and questions about working with CrowdFlower.

12.07.02, 12.30 – 13:30
Thursday 6 October
Bruce Croft (University of Massachusetts Amherst)
[TIGER] Neural Net Approaches to Information Retrieval

Retrieval tasks involve text at a variety of granularities, including documents, passages, and short answers. Considerable research has gone into developing retrieval approaches for each of these granularities. Given the success of neural net approaches in fields such as vision and NLP, researchers have started to investigate these approaches in IR. There have been some successes for small granularity text tasks, such as factoid retrieval, but progress in document retrieval tasks has been slower. In this talk, I will review some of the research in NN architectures, focusing on the work done in the CIIR.

W. Bruce Croft is a distinguished professor of computer science at the University of Massachusetts Amherst whose work focuses on information retrieval. He is the founder of the Center for Intelligent Information Retrieval and served as the editor-in-chief of ACM Transactions on Information Systems from 1995 to 2002. He was also a member of the National Research Council Computer Science and Telecommunications Board from 2000 to 2003. Since 2015, he has been the Dean of the College of Information and Computer Sciences at the University of Massachusetts Amherst. He was Chair of the UMass Amherst Computer Science Department from 2001 to 2007.

12.07.02, 12.30 – 13:30

Meetings in 2016 Q3

Thursday 29 September
Pengfei Li
[TIGER] On the Effectiveness of Query Weighting for Adapting Rank Learners to New Unlabelled Collections

Query-level instance weighting is a technique for unsupervised transfer ranking, which aims to train a ranker on a source collection so that it also performs effectively on a target collection, even if no judgement information exists for the latter. Past work has shown that this approach can be used to significantly improve effectiveness; in this work, the approach is re-examined on a wide set of publicly available L2R test collections with more advanced learning to rank algorithms. Different query-level weighting strategies are examined against two transfer ranking frameworks: AdaRank and a new weighted LambdaMART algorithm. Our experimental results show that the effectiveness of different weighting strategies, including those shown in past work, varies under different transferring environments. In particular, (i) Kullback-Leibler based density-ratio estimation tends to outperform a classification-based approach and (ii) aggregating document-level weights into query-level weights is likely superior to direct estimation using a query-level representation. The Nemenyi statistical test, applied across multiple datasets, indicates that most weighting transfer learning methods do not significantly outperform baselines, although there is potential for the further development of such techniques.

12.07.02, 12.30 – 13:30
Thursday 22 September
Lishan Cui
[BDDA] Topical Event Detection on Twitter

Event detection on Twitter has attracted active research. Although existing work considers the semantic topic structure of documents for event detection, the topic dynamics and the semantic consistency are under-investigated. In this paper, we study the problem of topical event detection in tweet streams. We define topical events as the bursty occurrences of semantically consistent topics. We decompose the problem of topical event detection into two components: (1) We address the issue of the semantic incoherence of the evolution of topics. We propose to improve topic modelling to filter out semantically inconsistent dynamic topics. (2) We propose to perform burst detection on the time series of dynamic topics to detect bursty occurrences. We apply our proposed techniques to a real-world application by detecting topical events in public transport tweets. Experiments demonstrate that our approach can detect newsworthy events with a high success rate.
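
The abstract does not say which burst detector is used, so the following is only a generic sketch of what burst detection on a topic's time series can look like: a time step is flagged when its count exceeds the trailing mean by a multiple of the trailing standard deviation. The window size, threshold, variance floor, and example counts are all assumptions.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices whose count exceeds trailing mean + threshold * std."""
    bursts = []
    for t in range(window, len(counts)):
        history = counts[t - window:t]
        mu = mean(history)
        # Floor the deviation at 1.0 so flat histories do not flag tiny changes.
        sigma = max(pstdev(history), 1.0)
        if counts[t] > mu + threshold * sigma:
            bursts.append(t)
    return bursts


# Hypothetical hourly tweet counts for one dynamic topic.
counts = [4, 5, 6, 5, 4, 5, 40, 6, 5]
print(detect_bursts(counts))  # [6]
```

Only the spike at index 6 is flagged; the hours after it are not, because the spike itself inflates the trailing mean and deviation.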

Liangjun Song
[BDDA] Joint Top-k Subscription Query Processing over Microblog Threads

With the increasing volume of social media messages, users on social platforms have started to seek out ideas and opinions by themselves. Publish/subscribe (pub/sub) systems are used by those who want to actively read and consume web data. Web platforms give people opportunities to communicate with others, so the social property is also important in pub/sub, yet no existing work has considered it. Also, platforms like Twitter or Facebook only allow users to post a short message, which causes the short-text problem: single posts lack contextual information. Therefore, we propose the microblog thread as the minimum information unit in subscription queries to capture socially and textually relevant information. However, this raises two challenges: 1. How to retrieve the microblog thread while the stream of microblogs keeps updating the microblog threads and the results of subscription queries keep changing? 2. How to represent the subscription results while the microblog threads are frequently updated? Hence, we propose group filtering and individual filtering to cope with the high update rate of subscription results. Extensive experiments on real datasets have been conducted to verify the efficiency and scalability of our proposed approach.

Yassien Shaalan
[BDDA] A Time and Opinion Quality-Weighted Model for Aggregating Online Reviews

Online reviews play an important role in helping online shoppers make buying decisions. However, reading all or most of the reviews is an overwhelming and time-consuming task. Many online shopping websites provide aggregate scores for products to help consumers make decisions. Averaging star ratings from all online reviews is widely used but is hardly effective for ranking products. Recent research proposed weighted aggregation models, where weighting heuristics include opinion polarities from mining review textual contents as well as distribution of star ratings. But the quality of opinions in reviews is largely ignored in existing aggregation models. In this paper we propose a novel review weighting model combining the information on the posting time and opinion quality of reviews. In particular, we make use of helpfulness votes for reviews from the online review communities to measure opinion quality. Our model generates aggregate scores to rank products. Extensive experiments on an Amazon dataset showed that our model ranked products in strong correspondence with customer purchase rank and outperformed several other approaches.
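
The general shape of a time- and quality-weighted aggregate can be sketched as a weighted mean of star ratings. The specific weights below (an exponential recency decay with an assumed half-life, multiplied by a smoothed helpfulness-vote ratio) are illustrative assumptions and should not be read as the weighting model proposed in the paper.

```python
def aggregate_score(reviews, half_life_days=365.0):
    """Weighted mean star rating.

    Each review is (stars, age_days, helpful_votes, total_votes).
    Weight = recency decay * smoothed helpfulness ratio (both assumed forms).
    """
    num = den = 0.0
    for stars, age_days, helpful, total in reviews:
        recency = 0.5 ** (age_days / half_life_days)
        quality = (helpful + 1) / (total + 2)  # Laplace-smoothed vote ratio
        weight = recency * quality
        num += weight * stars
        den += weight
    return num / den if den else None


# Hypothetical product: a recent, helpful 5-star review and an old,
# unhelpful 1-star review.
reviews = [(5, 0, 9, 10), (1, 730, 0, 10)]
print(aggregate_score(reviews))
```

A plain average of these two reviews would be 3 stars, whereas the weighted score stays close to 5 because the old, unhelpful review receives very little weight.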

12.12.02, 12.30 – 13:30
Thursday 15 September
Xiuzhen (Jenny) Zhang
[BDDA] KRNN: k rare-class nearest neighbour classification

Imbalanced classification is a challenging problem. Re-sampling and cost-sensitive learning are global strategies for generality-oriented algorithms such as the decision tree, targeting inter-class imbalance. We research local strategies for the specificity-oriented learning algorithms like the k Nearest Neighbour (KNN) to address the within-class imbalance issue of positive data sparsity. We propose an algorithm k Rare-class Nearest Neighbour, or KRNN, by directly adjusting the induction bias of KNN. We propose to form dynamic query neighbourhoods, and to further adjust the positive posterior probability estimation to bias classification towards the rare class. We conducted extensive experiments on thirty real-world and artificial datasets to evaluate the performance of KRNN. Our experiments showed that KRNN significantly improved KNN for classification of the rare class, and often outperformed re-sampling and cost-sensitive learning strategies with generality-oriented base learners.

[TIGER] Automatic labelling of topics via analysis of user summaries

Topic models have been widely used to discover useful structures in large collections of documents. A challenge in applying topic models to any text analysis task is to meaningfully label the discovered topics so that users can interpret them. In existing studies, words and bi-gram phrases extracted internally from documents are used as candidate labels but are not always understandable to humans. In this paper, we propose a novel approach to extracting words and meaningful phrases from external user generated summaries as candidate labels and then rank them via the Kullback-Leibler semantic distance metric. We further apply our approach to analyse an Australian healthcare discussion forum. User study results show that our proposed approach produces meaningful labels for topics and outperforms state-of-the-art approaches to labelling topics.
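
As a minimal sketch of ranking candidate labels by Kullback-Leibler divergence, the code below compares a topic's word distribution with the word distribution of each candidate label and sorts the labels by increasing divergence. The toy distributions, the smoothing floor `eps`, and the direction KL(topic || label) are assumptions for illustration, not details taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for word distributions given as dicts, flooring q at eps."""
    total = 0.0
    for word, pw in p.items():
        if pw > 0:
            total += pw * math.log(pw / max(q.get(word, 0.0), eps))
    return total


# Hypothetical topic and candidate-label word distributions.
topic = {"pain": 0.5, "knee": 0.3, "surgery": 0.2}
candidates = {
    "knee pain": {"knee": 0.5, "pain": 0.5},
    "knee surgery pain": {"pain": 0.4, "knee": 0.3, "surgery": 0.3},
    "diet advice": {"diet": 0.6, "advice": 0.4},
}

ranked = sorted(candidates, key=lambda label: kl_divergence(topic, candidates[label]))
print(ranked)
```

Labels whose word distribution covers the topic's high-probability words come first; a label that shares no words with the topic is pushed to the end by the floor on missing probabilities.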

12.12.02, 12.30 – 13:30
Thursday 8 September
Horacio Saggion (Universitat Pompeu Fabra, Barcelona)
[TIGER] An Overview of Scientific Text Mining and Summarization Research at TALN

During the last decade the amount of scientific information available on-line increased at an unprecedented rate, with recent estimates reporting that a new paper is published every 20 seconds. In this scenario of scientific information overload, researchers are overwhelmed by an enormous and continuously growing number of articles to access in their daily activities. The exploration of recent advances concerning specific topics, methods and techniques, peer reviewing, the writing and evaluation of research proposals and in general any activity that requires a careful and comprehensive assessment of scientific literature has turned into an extremely complex and time-consuming task. The availability of text mining tools able to extract, aggregate, summarize and turn scientific unstructured textual contents into well organized and interconnected knowledge is fundamental in a scientific information access scenario. In this presentation, I will first describe work carried out in our Natural Language Processing Research Group (TALN) in several areas to then provide an overview of our current work on scientific text mining and summarization.

Horacio Saggion is associate professor at the Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona. He joined the department as a Ramon y Cajal Research Fellow in 2010. He is associated with the Natural Language Processing group where he works on automatic text summarization, text simplification, information extraction, sentiment analysis and related topics. He obtained his PhD diploma in Computer Science from University of Montreal, Canada in 2000. His research is empirical, combining symbolic, pattern-based approaches and statistical and machine learning techniques. Before joining Universitat Pompeu Fabra, he worked at the University of Sheffield for over nine years developing cutting-edge competitive human language technology. He was also an invited researcher at Johns Hopkins University, USA. He is currently principal investigator for UPF in the EU projects Dr Inventor and Able-to-Include and for the national project TUNER. Horacio has published over 100 works in leading scientific journals, conferences, and books in the field of human language technology. He organized four international workshops in the areas of text summarization and information extraction and was scientific Co-chair of STIL 2009 and scientific Chair of SEPLN 2014. He is co-editor of a book on multilingual, multisource information extraction and summarization published by Springer (2013).

12.07.02, 12.30 – 13:30
Thursday 1 September
Jessie Nghiem
[TIGER] Spatial Textual Top-k Search in Mobile Peer-to-Peer Networks

Mobile hardware and software is quickly becoming the dominant computing model for technologically savvy people around the world. Nowadays, mobile devices are commonly equipped with GPS and wireless connections. Users have also developed the habit of regularly checking into a location, and adding comments or ratings for restaurants or any place of interest visited. This work explores new approaches to make data available from a local network, and to build a collaborative search application that can suggest locations of interest based on distance, user reviews and ratings. The proposed system includes light-weight indexing to support distributed search over spatio-textual data on mobile devices, and a ranking function to score objects of interest with relevant user review content. From our experimental study using a Yelp dataset, we found that our proposed system provides substantial efficiency gains when compared with a centralised system, with little loss in overall effectiveness. We also present a methodology to quantify efficiency and effectiveness trade-offs in decentralized search systems using the Rank-Biased Overlap (RBO) measure.
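
To show how two result lists (for example, from a centralised and a decentralised system) can be compared with the RBO measure named above, here is a minimal sketch of a truncated RBO computation. It sums the weighted prefix agreements down to the shorter list length and does no extrapolation, so it is a simplification of the full measure; the function name and the persistence value p=0.9 are assumptions.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap of two duplicate-free rankings.

    Sums (1 - p) * p**(d - 1) * agreement(d) for depths d = 1..k, where k is
    the shorter list length and agreement(d) is the size of the overlap of
    the two top-d prefixes divided by d. Identical lists score 1 - p**k.
    """
    k = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    overlap = 0
    score = 0.0
    for d in range(1, k + 1):
        a, b = ranking_a[d - 1], ranking_b[d - 1]
        if a == b:
            overlap += 1
        else:
            # Each new item adds to the overlap if the other list already has it.
            overlap += (a in seen_b) + (b in seen_a)
        seen_a.add(a)
        seen_b.add(b)
        score += (1 - p) * p ** (d - 1) * overlap / d
    return score


print(rbo_truncated(["d1", "d2", "d3"], ["d2", "d1", "d3"]))
```

Because the weights decay geometrically with depth, disagreements near the top of the lists cost more than disagreements further down.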

Xiaolu Lu
[TIGER] Efficient and Effective Higher Order Proximity Modeling

Bag-of-words retrieval models are widely used, and provide a robust trade-off between efficiency and effectiveness. These models often make simplifying assumptions about relations between query terms, and treat term statistics independently. However, query terms are rarely independent, and previous work has repeatedly shown that term dependencies can be critical to improving the effectiveness of ranked retrieval results. Among all term-dependency models, the Markov Random Field (MRF) [Metzler and Croft, SIGIR, 2005] model has received the most attention in recent years. Despite clear effectiveness improvements, these models are not deployed in performance-critical applications because of the potentially high computational costs. As a result, bigram models are generally considered to be the best compromise between full term dependence, and term-independent models such as BM25.

Here we provide further evidence that term-dependency features not captured by bag-of-words models can reliably improve retrieval effectiveness. We also present a variation on the highly-effective MRF model that relies on a BM25-derived potential. The benefit of this approach is that it is built from feature functions which require no higher-order global statistics. We empirically show that our new model reduces retrieval costs by up to 60%, with no loss in effectiveness compared to previous approaches.
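
For reference, the term-independent BM25 baseline mentioned above can be sketched as follows. This is one common textbook variant of BM25 (with a non-negative IDF and the usual k1 and b defaults); it is not the BM25-derived MRF potential proposed in the talk, and the function and parameter names are placeholders.

```python
import math


def bm25_score(query_terms, doc_tf, doc_len, avg_doc_len, df, num_docs,
               k1=1.2, b=0.75):
    """Bag-of-words BM25 score of one document for a query.

    doc_tf maps term -> frequency in the document; df maps term -> number of
    documents containing the term. Each query term contributes independently.
    """
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf == 0 or term not in df:
            continue
        idf = math.log(1 + (num_docs - df[term] + 0.5) / (df[term] + 0.5))
        length_norm = 1 - b + b * doc_len / avg_doc_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


# Hypothetical collection statistics and document.
df = {"optimal": 50, "search": 400, "tree": 120}
doc_tf = {"optimal": 2, "tree": 3, "binary": 1}
print(bm25_score(["optimal", "search", "tree"], doc_tf, 120, 100.0, df, 1000))
```

The score ignores where the query terms occur relative to one another, which is exactly the information that term-dependency models add.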

12.10.02, 12.30 – 13:30
Thursday 25 August
Wei Shao
[BDDA] Clustering Big Spatiotemporal-Interval Data

We propose a model for clustering data with spatiotemporal intervals, which is a type of spatiotemporal data associated with a start- and an end-point. This model can be used to effectively evaluate clusters of spatiotemporal interval data, where each interval signifies an event at a particular location that stretches over a period of time. Our work aims to deal with evaluating the results of clustering in multiple Euclidean spaces. This is different from traditional clustering, which measures results in a single Euclidean space. A new energy function is proposed that measures similarity and balance between clusters in spatial, temporal, and data dimensions. A large collection of parking data from a real CBD area is used as a case study. The proposed model is applied to existing traditional algorithms to solve the spatiotemporal interval data clustering problem. Using the proposed energy function, the results of traditional clustering algorithms are compared and analysed.

12.13.02, 12.30 – 13:30