Introduction
The Ijaz project, a single- and multi-document summarization system, was commissioned by Iran’s information technology institution and carried out by Ferdowsi University’s web technology laboratory. A web-based version of the summarizer has also been produced and can be found on the home page. The system can produce summaries of single or multiple documents written in Persian or English. Many criteria were used in building it.
For the first time in Iran, a large Persian-language summarization corpus has been produced for evaluating summarization systems against the necessary standards, at a cost of more than 2000 person-hours. The “Paasokh” corpus (a standard corpus for summarization systems) is offered in two types: single-document and multi-document. The single-document corpus consists of 100 different news subjects chosen from Iran’s popular news agencies. Each subject has 5 abstractive summaries written by trained experts. The multi-document corpus has 50 subjects, each consisting of 20 documents and 5 expert-written abstractive summaries.
Also for the first time in the country, a tool for evaluating summarization systems has been produced. It evaluates a summarization system using several criteria together with the human-written summaries in the Paasokh corpus. The tool can be downloaded from the tools section of the web site. Other tools for natural language pre-processing have also been developed and can be downloaded from the site.
Project Links
Project Members
- Ehsan Asgarian
- Ahmad Estiri
- Fatemeh Pourgholamali
- Asef Pourmasoumi
- Hadi Qaemi
- Reza Saeedi
- Seyed Ahmad Tousi
Research Areas
Run Date
2012
Theses
- Concept-based Automatic Multi-document Summarization, Asef Pourmasoumi, Ph.D. thesis
- A novel method for semantic weighting in text processing use cases, Hossein Kamyar, M.S. thesis
- Abstractive summarization based on sentences’ similarities, Fatemeh Pourgholamali, Ph.D. thesis
- A semantics-oriented approach to the automatic evaluation of English and Persian machine summarizers using a wordnet, Ahmad Estiri, M.S. thesis