Parsi Pardaz

To use the software, first install the .NET Framework 4.5. The program may be used in scientific research, provided that the Web Technology Laboratory of Ferdowsi University of Mashhad is cited. If you use this tool in your research, please cite it as follows:
Persian language text processing tools, Web Technology Laboratory, Ferdowsi University of Mashhad, 2013. (wtlab.um.ac.ir)
If you notice an error in the root or the tag assigned to a word, please report it to ehsan.asgarian@gmail.com.

Persian part-of-speech tagging tool
Semantic stemming tool for Persian, version 1.6
Persian parser tool

More details

POS TAGGER

Part-of-speech (POS) tagging assigns a lexical tag to each word and symbol in a text, so that the tag shows the role of that word or symbol in its sentence. A high percentage of words are ambiguous with respect to their tags, because the same word can take different tags in different positions. POS tagging is therefore the task of resolving this ambiguity according to the surrounding context. It is a fundamental step for many other areas of natural language processing (NLP), such as machine translation, error detection, and text-to-speech. Many models and methods have been used for tagging in different languages, including:
Hidden Markov models
Transformation-based (rule-based) taggers
Memory-based systems
Maximum entropy systems
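
To make the first method in the list concrete, the following is a minimal sketch of hidden Markov model tagging with the Viterbi algorithm. It is not the tagger shipped with this tool. The tag set, the English toy words, and every probability below are invented for illustration, and UNSEEN is an assumed floor for words the toy model has never seen.

```python
import math

# Toy model: all tags, words, and probabilities are illustrative assumptions.
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"book": 0.4, "flight": 0.3, "people": 0.3},
    "VERB": {"book": 0.6, "take": 0.4},
}
UNSEEN = 1e-6  # assumed emission probability for a word unseen with a tag


def viterbi(words):
    """Return the most probable tag sequence for `words` under the toy HMM."""
    if not words:
        return []
    # best[i][t] = (log-probability of the best path ending in tag t, previous tag)
    best = [{
        t: (math.log(START_P[t]) + math.log(EMIT_P[t].get(words[0], UNSEEN)), None)
        for t in TAGS
    }]
    for word in words[1:]:
        column = {}
        for t in TAGS:
            emit = math.log(EMIT_P[t].get(word, UNSEEN))
            column[t] = max(
                (best[-1][p][0] + math.log(TRANS_P[p][t]) + emit, p) for p in TAGS
            )
        best.append(column)
    # Backtrack from the best final tag to the start of the sentence.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


print(viterbi(["the", "book"]))                       # ['DET', 'NOUN']
print(viterbi(["people", "book", "the", "flight"]))   # ['NOUN', 'VERB', 'DET', 'NOUN']
```

The ambiguous word "book" receives a different tag in each sentence, which is the context-dependent disambiguation described above. A real tagger would estimate the probabilities from a tagged corpus.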

STEMMER

The semantic stemming project for Persian aims to split a text into words and reduce each word to its root. The main difference between this project and other stemming research is that it returns words to their roots without losing their meaning in the sentence. To this end, special attention is paid to the role of each word in the sentence. The project uses the collection of verbs compiled by the Dadegan group and the frequently used words of the Hamshahri corpus.
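
The idea of using a word's role in the sentence to guide stemming can be sketched as follows. This is only an illustration under stated assumptions, not the project's algorithm: the suffix rules, the English toy words, and the small LEXICON (a stand-in for resources such as a verb list and a frequent-word list) are all hypothetical.

```python
# Hypothetical suffix rules, keyed by part-of-speech tag (English toy data).
SUFFIX_RULES = {
    "VERB": ["ing", "ed", "s"],
    "NOUN": ["s"],
}

# Hypothetical lexicon of known roots per tag; a stand-in for real word lists.
LEXICON = {
    "VERB": {"meet", "build", "book"},
    "NOUN": {"meeting", "building", "book"},
}


def stem(word, tag):
    """Strip a suffix only if the result is a known root for the given tag."""
    for suffix in SUFFIX_RULES.get(tag, []):
        if word.endswith(suffix):
            candidate = word[: -len(suffix)]
            if candidate in LEXICON.get(tag, set()):
                return candidate
    return word  # no rule applies: keep the word unchanged


print(stem("meeting", "VERB"))   # meet
print(stem("meeting", "NOUN"))   # meeting
print(stem("meetings", "NOUN"))  # meeting
```

The same surface form is reduced differently depending on its tag: as a verb, "meeting" goes back to "meet", while as a noun it is kept, so its meaning in the sentence is not lost.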

PARSER

In parallel with theoretical developments in modern linguistics, computational methods for analyzing text and grammar have also evolved. Here, the grammar of a language means a set of rules that a computer can interpret and that allow the syntactic components of a sentence to be separated correctly. Analyzing a sentence and breaking it into components such as nominal, verbal, and adverbial groups is done by a tool called a parser, which plays an essential role in building other text processing tools or improving their accuracy.
The parser designed for Persian in this project builds a syntactic (parse) tree for each sentence of a text from the structure of the words, their position and order in the sentence, the letters or phrases before and after them, and their type. In effect, parsing follows both morphology (the study of the structure and different forms of a word) and the grammar rules of Persian. The more closely the spelling and punctuation of the sentences follow standard conventions, the better the quality of the parse, and the fewer and simpler the operations needed to label the components of the sentence.
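
As a rough illustration of breaking a sentence into nominal, verbal, and adverbial groups, the sketch below groups already-tagged words into flat chunks. This is a much simpler technique than building a full syntactic tree and is not the parser described above. The tag names, the GROUP_OF_TAG mapping, and the English example sentence are assumptions made for illustration.

```python
# Hypothetical mapping from part-of-speech tags to group labels:
# NP = nominal group, VP = verbal group, ADVP = adverbial group.
GROUP_OF_TAG = {
    "DET": "NP", "ADJ": "NP", "NOUN": "NP",
    "VERB": "VP", "AUX": "VP",
    "ADV": "ADVP",
}


def chunk(tagged_words):
    """Merge consecutive words whose tags map to the same group label."""
    groups = []
    for word, tag in tagged_words:
        label = GROUP_OF_TAG.get(tag)
        if label is None:
            # Punctuation or an unknown tag stands alone and closes the open group.
            groups.append(("O", [word]))
        elif groups and groups[-1][0] == label:
            groups[-1][1].append(word)
        else:
            groups.append((label, [word]))
    return groups


sentence = [
    ("the", "DET"), ("old", "ADJ"), ("man", "NOUN"),
    ("slowly", "ADV"), ("read", "VERB"),
    ("a", "DET"), ("book", "NOUN"), (".", "PUNCT"),
]
for label, words in chunk(sentence):
    print(label, " ".join(words))
```

Because the punctuation mark closes the open group, the sketch also shows in a small way why well-punctuated input makes the components of a sentence easier to label.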

Research Field

NLP

Implementation Date

2013

Project Members

Ahmad Estiri

Seyyed Mohammad Asghari Nekah

Reza Saeedi

Seyyed Ahmad Toosi

Ehsan Askarian

Hadi Ghaemi