Parsing Algorithm for English to Indian Language Machine Translation System using Tree Adjoining Grammar Formalism

Lina Nath, Apeksha Joshi

Abstract


A parser is a crucial component of Natural Language Processing (NLP). It performs the syntactic analysis of a sentence, which includes identifying parts of speech, grouping words into phrases as grammatical constituents, and determining the syntactic relations between phrases. In machine translation, a parser analyzes sentences according to a grammar formalism and builds a data structure corresponding to that grammar, which indicates whether the sentence is syntactically correct. Of the formalisms available for sentence analysis in natural language processing, Tree Adjoining Grammar (TAG) appears to be the most promising for machine translation, here specifically from English to Indian languages. A hybrid parsing approach known as the Earley-type TAG parser gives better performance, which can be improved further by placing constraints on the operations of the TAG formalism, applying reduction during lexical analysis, and applying heuristic rules during implementation.

This paper presents and discusses the basic concepts of the TAG formalism: the trees used to analyze sentences, the operations by which those trees are combined during parsing, the recognizer algorithm, the use of the TAG formalism to build an Earley-type parser, and the constraints and heuristics that improve the performance of the Earley-type TAG parser for machine translation.
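To make the trees and operations mentioned above concrete, the following is a minimal sketch of the two standard TAG operations, substitution and adjunction, applied to a toy English sentence. It is not the Earley-type parser itself, and it is not taken from the paper: the `Node` class, the helper names, and the example trees for "Ravi sleeps soundly" are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of an elementary or derived tree (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False  # frontier node marked for substitution
    foot: bool = False   # foot node of an auxiliary tree


def copy_tree(n: Node) -> Node:
    return Node(n.label, [copy_tree(c) for c in n.children], n.subst, n.foot)


def substitute(tree: Node, initial: Node) -> Node:
    """Replace the first substitution site labeled like `initial` with `initial`."""
    result = copy_tree(tree)

    def visit(node: Node) -> bool:
        for i, child in enumerate(node.children):
            if child.subst and child.label == initial.label:
                node.children[i] = copy_tree(initial)
                return True
            if visit(child):
                return True
        return False

    if not visit(result):
        raise ValueError("no matching substitution site")
    return result


def find_foot(node: Node) -> Optional[Node]:
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(tree: Node, auxiliary: Node) -> Node:
    """Adjoin `auxiliary` at the first interior node carrying its root label."""
    result = copy_tree(tree)
    aux = copy_tree(auxiliary)
    foot = find_foot(aux)
    if foot is None or foot.label != aux.label:
        raise ValueError("auxiliary tree needs a foot labeled like its root")

    def visit(node: Node) -> bool:
        if node.label == aux.label and node.children and not node.subst:
            foot.children = node.children  # old subtree now hangs under the foot
            foot.foot = False
            node.children = aux.children   # auxiliary root takes the site's place
            return True
        return any(visit(c) for c in list(node.children))

    if not visit(result):
        raise ValueError("no matching adjunction site")
    return result


def yield_of(node: Node) -> List[str]:
    """Left-to-right frontier labels of a tree."""
    if not node.children:
        return [node.label]
    return [w for c in node.children for w in yield_of(c)]


# Toy elementary trees (assumed for illustration only).
alpha_sleeps = Node("S", [Node("NP", subst=True),
                          Node("VP", [Node("V", [Node("sleeps")])])])
alpha_ravi = Node("NP", [Node("N", [Node("Ravi")])])
beta_soundly = Node("VP", [Node("VP", foot=True),
                           Node("Adv", [Node("soundly")])])

derived = adjoin(substitute(alpha_sleeps, alpha_ravi), beta_soundly)
print(" ".join(yield_of(derived)))  # Ravi sleeps soundly
```

Both operations return new trees and leave the elementary trees untouched, so the same elementary tree can be reused across derivations. A constraint on adjunction of the kind the abstract mentions could be modeled in this sketch as an extra per-node flag that `adjoin` checks before selecting a site.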

 



This work is licensed under a Creative Commons Attribution 3.0 License.