Summarization

Background
Automatic summarization is the process of condensing a text document with a computer program to create a summary that retains the most important points of the original. As information overload has grown along with the quantity of available data, so has interest in automatic summarization. The major challenge is to determine the communicative intent of the original text accurately, to formulate a summary that preserves the essence of the original corpus, and to present it in human-readable form. If the documents are written in more than one language, the summarization process must also cope with translation errors. As far as the automatic summarization of user-generated content is concerned, all the challenges of information extraction from noisy data apply to summarization as well.

Research Areas
Currently, we are working in the following areas of summarization:

Multilingual multi-document summarization: Creating a monolingual summary for a corpus that contains documents written in more than one language.
Summarization of online conversations: The emergence of microblogs has drawn the attention of researchers to this area. The ability to summarize conversations can provide useful insight into the opinion of the masses.
Update summarization: An emerging summarization task that creates a short summary of a set of articles, under the assumption that the user has already read a given set of earlier articles.
Blog summarization: With the ever-increasing number of blogs written every day, the amount of user-generated content is also increasing. A means is needed to turn this content into meaningful information. Blog summarization is one technique for addressing this problem of information overload.
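
To make the update summarization task concrete, here is a minimal sketch in Python. It is an illustration only, not the method used in this work: it assumes a simple extractive approach in which sentences from the new articles are scored by how frequent their words are in the new set, counting only words that never appeared in the earlier articles. The function names (tokenize, split_sentences, update_summary) and the scoring rule are hypothetical choices made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def update_summary(earlier_articles, new_articles, max_sentences=2):
    """Return up to max_sentences sentences from new_articles that carry
    the most information not already present in earlier_articles.

    Assumption (illustrative only): novelty is measured at the word level.
    """
    # Vocabulary the reader is assumed to have seen already.
    seen = set()
    for article in earlier_articles:
        seen.update(tokenize(article))

    # Candidate sentences and word frequencies come from the new articles.
    sentences = [s for a in new_articles for s in split_sentences(a)]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Only unseen words contribute; dividing by length avoids
        # favouring long sentences.
        novel = [w for w in words if w not in seen]
        return sum(freq[w] for w in novel) / len(words)

    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:max_sentences]


if __name__ == "__main__":
    earlier = ["The storm hit the coast on Monday."]
    new = ["The storm hit the coast on Monday. "
           "Rescue teams reached the flooded villages. "
           "Rescue teams delivered food."]
    for sentence in update_summary(earlier, new):
        print(sentence)
```

In this toy input, the sentence repeated from the earlier article scores zero because all of its words have been seen, so only the two sentences about rescue teams are selected. A realistic system would also need stop-word handling, redundancy control among the selected sentences, and a more robust sentence splitter.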