JTAGGER

Main Author: YAACOB, NORHANA
Format: Final Year Project
Language: English
Institution: Universiti Teknologi Petronas
Record Id / ISBN-0: utp-utpedia.7056 /
Published: Universiti Teknologi Petronas 2006
Subjects:
Online Access: http://utpedia.utp.edu.my/7056/1/2006%20-%20JTAGGER.pdf
http://utpedia.utp.edu.my/7056/
Summary: Part-of-speech tagging, also called grammatical tagging, is the process of assigning each word in a sentence its corresponding part of speech, such as noun, verb, or pronoun, or another lexical class marker. It is an important step in natural language processing. Tagging is a difficult task because it must resolve ambiguity: a word can represent more than one part of speech in different contexts. A word, phrase, or sentence is ambiguous if it has more than one meaning. The word 'light', for example, can mean not very heavy or not very dark. There are two types of ambiguity, lexical and structural. When a word has more than one meaning, it is said to be lexically ambiguous. When a phrase or sentence can have more than one structure, it is said to be structurally ambiguous. Part-of-speech tagging algorithms fall into three classes: rule-based taggers, stochastic taggers, and transformation-based taggers. In this project, a rule-based tagging algorithm is used to develop a system named JTagger. The tagger initially assigns each word its most likely tag, estimated by examining a corpus annotated with the Penn Treebank tagset. JTagger performs the tagging process automatically with reasonable accuracy, thus sparing the reader the difficulty of tagging a sentence by hand. Part-of-speech tagging is important since it can help people understand English better. The system is written in Java because Java is platform independent and can run on any platform, including Microsoft Windows and UNIX.
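
The initial step described in the summary, assigning each word the tag it carries most often in a tagged corpus, can be illustrated with a minimal Java sketch. This is not the JTagger source: the class name MostLikelyTagger, its methods, the lowercasing of words, and the fallback tag for unknown words are all assumptions made for illustration. The tags used (DT, JJ, NN) are Penn Treebank tags. The sketch covers only the most-likely-tag step, not any rules a rule-based tagger would apply afterwards.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of most-likely-tag assignment; not the JTagger code.
final class MostLikelyTagger {
    // word -> (tag -> number of times the word carried that tag in the corpus)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    // Assumption: unknown words receive a fixed fallback tag (for example NN).
    private final String fallbackTag;

    MostLikelyTagger(String fallbackTag) {
        this.fallbackTag = fallbackTag;
    }

    // Record one (word, tag) pair observed in the tagged training corpus.
    void observe(String word, String tag) {
        counts.computeIfAbsent(word.toLowerCase(Locale.ROOT), k -> new HashMap<>())
              .merge(tag, 1, Integer::sum);
    }

    // Return the tag seen most often with this word, or the fallback if unseen.
    String mostLikelyTag(String word) {
        Map<String, Integer> tagCounts = counts.get(word.toLowerCase(Locale.ROOT));
        if (tagCounts == null) {
            return fallbackTag;
        }
        String best = fallbackTag;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : tagCounts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    // Tag a tokenized sentence, producing "word/TAG" strings.
    List<String> tag(List<String> words) {
        List<String> tagged = new ArrayList<>();
        for (String word : words) {
            tagged.add(word + "/" + mostLikelyTag(word));
        }
        return tagged;
    }
}
```

With 'light' observed twice as an adjective (JJ) and once as a noun (NN), the sketch always tags it JJ. This shows why the most-likely-tag step alone cannot resolve the ambiguity the summary describes: context is ignored at this stage.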