What is Natural Language Processing?


INTRODUCTION TO NATURAL LANGUAGE PROCESSING

What is natural language? Any language that humans use to converse with one another is a natural language. Language is at the core of human evolution. It is one of the most fundamental expressions of our intelligence, and our ability to communicate through language is truly spectacular. What if computers could understand and converse in languages just like humans?

Natural Language Processing (NLP) is the use of computer programs to read, understand, and respond to human language in the form of text and voice data. It is often described as a subset of machine learning (ML), although strictly it is a field of its own that relies heavily on ML; both NLP and ML fall under the larger category of artificial intelligence. NLP covers the understanding and generation of written and spoken natural language using software that can analyze, understand, and manipulate human language. The technology draws on a range of disciplines, including machine learning and computational linguistics. This means that when you write or speak phrases, sentences, or even longer content, an NLP system can interpret it based on the grammatical rules of the specific language.


Natural Language Processing Terminologies:

The following terms describe techniques used to clean and prepare text data so that a machine can analyze it.

Tokenization

Tokenization is a basic task in natural language processing: it breaks raw text into smaller units, such as words or sentences, called tokens.
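As a minimal sketch of the idea, the following Python splits text into sentence tokens and word tokens using only regular expressions. The function names sentence_tokenize and word_tokenize are illustrative choices made here, and the rules are deliberately naive (for example, abbreviations such as "Dr." would be split incorrectly); real tokenizers handle many more cases.

```python
import re

def sentence_tokenize(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_tokenize(sentence):
    """Return word tokens, keeping internal apostrophes (e.g. "don't")."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*", sentence)

text = "NLP is fun. It breaks text into tokens!"
sentences = sentence_tokenize(text)
words = [word_tokenize(s) for s in sentences]
print(sentences)
print(words)
```

Here each sentence becomes one token at the sentence level, and each sentence is then broken into a list of word tokens with the punctuation dropped.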

Sequencing

Sequencing converts sentences into sequences of numbers, typically by mapping each word to an integer, so that they are ready to be fed to neural networks.
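One way this might look in practice is sketched below, assuming a simple whitespace split and a vocabulary built from the sentences themselves. The helper names build_vocab and to_sequences, the choice of 0 as the padding value, and the sample sentences are all assumptions made for illustration.

```python
def build_vocab(sentences):
    """Map each distinct lowercased word to an integer id; 0 is reserved for padding."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab

def to_sequences(sentences, vocab, max_len):
    """Convert sentences to fixed-length id lists, truncating or right-padding with 0."""
    sequences = []
    for sentence in sentences:
        ids = [vocab[w] for w in sentence.lower().split() if w in vocab]
        ids = ids[:max_len]
        sequences.append(ids + [0] * (max_len - len(ids)))
    return sequences

sentences = ["I love my dog", "I love my cat", "You love my dog"]
vocab = build_vocab(sentences)
sequences = to_sequences(sentences, vocab, max_len=5)
print(vocab)
print(sequences)
```

Padding every sequence to the same length is a common convention because neural networks usually expect inputs of a fixed shape.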

Normalization

Normalization puts all words on an equal footing so that processing can proceed uniformly, for example by converting all words to the same case (upper or lower) so that variants of the same word are treated alike.
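A minimal sketch of normalization, assuming that lowercasing and punctuation removal are the only steps wanted (the function name normalize is an illustrative choice; real pipelines may also handle accents, numbers, or contractions):

```python
import re

def normalize(text):
    """Lowercase the text, drop punctuation, and collapse repeated whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()

normalized = normalize("Hello,   HELLO... hello!")
print(normalized)  # hello hello hello
```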

Corpus

A corpus is a collection of texts in a language. Corpora (the plural of corpus) are used for statistical linguistic analysis and hypothesis testing.

Bag of Words

It is a text representation that records how many times each word occurs in a document, ignoring word order. Sample text:

                   “Hello, hello, hello,” said Josh

                   “Here, here,” said John. “Here, here,”

The resulting bag-of-words representation as a dictionary (counting “Hello” and “hello” as the same word):

{
‘hello’:3,
‘said’:2,
‘Josh’:1,
‘here’:4,
‘John’:1
}
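The counts above can be reproduced with a short sketch using Python's collections.Counter. The function name bag_of_words is an illustrative choice, and this version lowercases everything, so the names appear as 'josh' and 'john' rather than with capitals.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercased word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

sample = '"Hello, hello, hello," said Josh. "Here, here," said John. "Here, here,"'
bag = bag_of_words(sample)
print(dict(bag))
```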

N-gram

An n-gram is a contiguous sequence of n words. An n-gram model is a language model that estimates the probability of a given sequence of words in a sentence: it predicts the next item from the previous n-1 items, which makes it a Markov model of order n-1.
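To make this concrete, here is a small sketch of a bigram model (n = 2), which predicts the next word from the single previous word. The three-sentence corpus, the function names, and the start and end markers are assumptions made for illustration, and the probabilities are plain maximum-likelihood estimates without smoothing.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following

def next_word_probability(model, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(model[prev].values())
    return model[prev][nxt] / total if total else 0.0

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(corpus)
p_cat = next_word_probability(model, "the", "cat")
print(p_cat)
```

In this toy corpus "the" is followed by "cat" twice and "dog" once, so the estimated probability of "cat" after "the" is two thirds.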


What are the techniques used in NLP?

Two components of the NLP system are:

Natural language understanding (NLU)

It is also called natural language interpretation (NLI) and covers the human-to-machine direction. It maps a given natural language input into a useful representation and analyzes different aspects of the language.

Natural language generation (NLG)

It is a software process that generates meaningful sentences and phrases as natural language output (the machine-to-human direction). This process involves text planning, sentence planning, and text realization.

Example: Automated journalism
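The three NLG stages can be illustrated with a very small template-based sketch that turns structured data into a short weather report, in the spirit of automated journalism. Everything here is hypothetical: the function names, the data fields (city, high, rain_chance), and the 50 percent threshold are invented for illustration, and real NLG systems are far more elaborate.

```python
def plan_text(data):
    """Text planning: decide which facts to mention and in what order."""
    facts = ["temperature"]
    if data["rain_chance"] >= 50:
        facts.append("rain")
    return facts

def plan_sentence(fact, data):
    """Sentence planning: pick a sentence template and the values that fill it."""
    if fact == "temperature":
        return ("{city} will reach {high} degrees today", data)
    return ("there is a {rain_chance} percent chance of rain", data)

def realize(template, values):
    """Text realization: fill the template, then add capitalization and punctuation."""
    sentence = template.format(**values)
    return sentence[0].upper() + sentence[1:] + "."

def generate_report(data):
    """Run the three stages in order and join the resulting sentences."""
    return " ".join(realize(*plan_sentence(fact, data)) for fact in plan_text(data))

report = generate_report({"city": "Pune", "high": 31, "rain_chance": 70})
print(report)
```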


Difficulties in Natural Language Processing

Lexical ambiguity

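It arises at a very primitive level, the word level: a single word can have more than one meaning. For example, “bank” can mean a financial institution or the side of a river.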

Syntax level ambiguity

It arises when a sentence can be parsed in more than one way, with each parse giving a different meaning.

Referential ambiguity

Very often a text mentions an entity (someone or something) and then refers to it again, possibly in a different sentence, using another word such as a pronoun. Deciding which entity that word refers to can be ambiguous.


Phases of Natural Language Processing

Lexical (structure) analysis

It is the process of identifying and analyzing the structure of words. The collection of words and phrases in a language is its lexicon.

Syntactic analysis (parsing)

Parsing analyzes the words in a sentence using a formal grammar. It arranges the words in a structure that shows the relationships between them.

Examples:

Lemmatization: It is one of the most common text preprocessing techniques used in NLP and ML. It reduces a word to its dictionary (citation) form, called the lemma.

For example, stemming the word “better” fails to return its citation form; however, lemmatization would result in the following:

Better to good

Stemming: Stemming refers to the method of reducing an inflected or derived word to its stem by removing the suffixes and prefixes attached to it.

Running to run

Part of speech (POS) tagging: Also called grammatical tagging, it is the process of categorizing the words in a text by part of speech, such as noun, adjective, verb, or adverb, depending on the meaning of the word and its context.
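The difference between stemming and lemmatization in the examples above can be sketched as follows. This is a toy illustration, not a real stemmer or lemmatizer: simple_stem strips a handful of suffixes by rule (a crude stand-in for something like the Porter stemmer), and simple_lemmatize looks words up in a tiny hand-written table that is purely hypothetical.

```python
def simple_stem(word):
    """Strip one common suffix by rule, without consulting any dictionary."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word

# Hypothetical, hand-written lemma table; real lemmatizers use a full dictionary.
LEMMAS = {"better": "good", "running": "run", "ran": "run", "mice": "mouse"}

def simple_lemmatize(word):
    """Look the word up in the lemma table, falling back to the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)

print(simple_stem("Running"), simple_lemmatize("Running"))
print(simple_stem("better"), simple_lemmatize("better"))
```

The rule-based stemmer handles "Running" but leaves "better" unchanged, because no suffix rule relates it to "good"; only the dictionary lookup recovers the citation form.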

Semantic analysis

Semantic analysis is the process of identifying the meaning and tone of unstructured text. It maps syntactic structures to objects in the task domain.

Examples:

Named entity recognition (NER): Categorizes words into groups such as people, organizations, and locations.

Word sense disambiguation: This determines which meaning of a word is intended, based on its context.
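As a rough sketch of word sense disambiguation, the code below picks the sense of "bank" whose definition shares the most content words with the surrounding sentence. This is a much simplified version of the overlap idea behind the Lesk algorithm; the two-sense inventory, the definitions, and the stop word list are all hypothetical and written by hand for this example.

```python
# Hypothetical two-sense inventory for "bank"; a real system would use a lexical database.
SENSES = {
    "financial institution": "an institution that accepts money deposits and makes loans",
    "river edge": "the sloping land beside a river or stream of water",
}

STOP_WORDS = {"a", "an", "and", "at", "of", "on", "or", "that", "the", "to"}

def content_words(text):
    """Return the set of lowercased words in the text that are not stop words."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}

def disambiguate(context_sentence, senses=SENSES):
    """Pick the sense whose definition shares the most content words with the context."""
    context = content_words(context_sentence)
    return max(senses, key=lambda s: len(context & content_words(senses[s])))

print(disambiguate("she sat on the bank of the river"))
print(disambiguate("he deposits money at the bank"))
```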

Discourse integration

In this step, the meaning of a sentence depends on the meaning of the sentences that precede it. It can also shape the meaning of the sentence that immediately follows.

Pragmatic analysis

It is the process of extracting the intended information from text: what was said is reinterpreted as what was actually meant, which requires knowledge of the context.


How does natural language processing work?

Sentence segmentation

Segmentation breaks the entire document down into its constituent sentences, splitting the article at punctuation such as full stops.

Word tokenization

For the algorithm to understand these sentences, we take the words in each sentence and present them individually to the algorithm. So we break each sentence down into its constituent words and store them. This is called tokenizing, and each word is called a token.

Identifying stop words

We can make the learning process faster by getting rid of non-essential words, which do not add much meaning to a statement and are there mainly to make it sound more cohesive. Words such as ‘as’, ‘are’, and ‘the’ are called stop words. For instance, “the”, “and”, and “a” are required words in a particular passage, but they do not contribute much to understanding its content. Once they are removed, only the more informative words in the text remain.

Stemming in NLP

Next, we need to explain the basic form of the words in our document to the machine. We start by explaining that some words are the same word with added prefixes and suffixes. This is called stemming.

Lemmatization

We also identify the base word for different tenses, moods, genders, and so on. This is called lemmatization, a name that stems from the term for the base word, the lemma.

Part of speech tagging

We explain the concept of nouns, verbs, articles, and other parts of speech to the machine by adding these tags to our words. This is called part of speech tagging.

Named entity tagging

We introduce our machine to pop culture references and everyday names by flagging the names of movies, important personalities, locations, and so on that may occur in the document. This is called named entity tagging.

Once we have our base words and tags, we use a machine learning algorithm such as Naïve Bayes to teach our model human sentiment and speech. Most of the preprocessing steps described here are simple grammar techniques.
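The final step can be sketched with a tiny Naïve Bayes sentiment classifier written from scratch. This is a minimal illustration under several assumptions: the four training sentences, the labels, and the stop word list are invented here, tokenization is a plain whitespace split, and add-one (Laplace) smoothing is used so that unseen words do not produce zero probabilities. A real system would be trained on a much larger labeled corpus.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "an", "and", "are", "as", "is", "the", "this", "was"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, and drop stop words."""
    words = (w.strip(".,!?\"'") for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]

def train(examples):
    """Collect per-label document counts, per-label word counts, and the vocabulary."""
    doc_counts, word_counts = Counter(), {}
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return doc_counts, word_counts, vocab

def predict(text, model):
    """Return the label with the highest log-probability under add-one smoothing."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training data.
examples = [
    ("This movie was great and fun", "positive"),
    ("A wonderful, great story", "positive"),
    ("This movie was boring", "negative"),
    ("A dull and boring plot", "negative"),
]
model = train(examples)
print(predict("The story was great", model))
print(predict("The plot was boring", model))
```

Working in log-probabilities avoids numerical underflow when many small word probabilities are multiplied together.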

 

Challenges of NLP

Natural language processing has the potential to bring significant social benefits. These technologies are advancing rapidly; however, they face many challenges. Here are a couple of them.

Natural language processing analyzes vast amounts of data to extract a particular piece of information. To function effectively, NLP models have to be trained on a curated corpus. However, finding the right or relevant answer is challenging because of the enormous complexity of machine learning algorithms that examine millions of unstructured and semi-structured data sets.

In human language, we often use the same vocabulary with different contextual meanings. Natural language processing algorithms are not yet fully able to distinguish between these context-dependent uses. The same challenge exists with ambiguity and homonyms, where NLP has to make a guess. However, as more data is captured and the technology learns, the models will improve. Slang and colloquialisms add to the difficulty. Formal language has rules and forms that rarely change, but informal language keeps evolving. This means that using natural language processing for a wide variety of applications is more challenging, because the data needed to train the model becomes larger, constantly evolving, and more unstructured. This is a significant challenge. However, advancements in technology suggest that this problem may eventually be overcome.

PS TECHNO BLOG: What is Natural Language Processing?
https://pstechnoblog.blogspot.com/2021/08/what-is-natural-language-processing.html