WhatsApp Chat Sentiment Analysis in R

Introduction to Natural Language Processing | NLP

What is Natural Language Processing

Natural Language Processing (NLP) is the field concerned with the interactions between human language and computers. NLP is generally used to analyse large document sets. Developers also use it to organise and structure knowledge for tasks such as automatic summarisation and translation.

NLP is mainly used to analyse text so that machines can understand how humans speak.

NLP libraries provide the building blocks for real-world applications. NLP combines computer science, artificial intelligence, and computational linguistics, and covers computer understanding and manipulation of human language. Unlike common word-processor operations, which treat text as a mere sequence of symbols, NLP considers the hierarchical structure of language. NLP systems have useful roles such as correcting grammar, converting speech to text, and automatically translating between languages. Real-world applications include automatic text summarisation, sentiment analysis, topic extraction, named-entity recognition, part-of-speech tagging, relationship extraction, and stemming.

NLP is commonly used for text mining, machine translation, and automated question answering. Many NLP algorithms are based on machine learning: instead of relying on hand-written rules, they learn rules automatically by analysing a set of examples.

For example, one can build a machine-learning RSS reader from components such as scrapers, AutoTag, Html2Text, Summariser, and Sentiment Analysis. It helps to be familiar with several NLP concepts: regular expressions and word tokenisation, simple topic identification, named-entity recognition, and building a fake news classifier.

Regular expressions & word tokenisation: These help you parse text. You'll also learn to handle non-English text and the more difficult tokenisation cases you might find as you explore the wide world of NLP.
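
As a rough illustration of regex-based word tokenisation, here is a minimal Python sketch that uses only the standard library. The pattern and the `tokenize` helper are assumptions made for this example, not a standard tokeniser; real tokenisers handle many more cases (URLs, emoji, hyphenation).

```python
import re

# A word is a run of word characters (letters, digits, underscore),
# optionally followed by apostrophe-joined parts so "don't" stays one token.
# In Python 3, \w is Unicode-aware, so accented letters are matched too.
WORD_RE = re.compile(r"\w+(?:'\w+)*")

def tokenize(text):
    """Return the lowercase word tokens found in text."""
    return WORD_RE.findall(text.lower())

print(tokenize("Don't panic: NLP isn't magic!"))
print(tokenize("Größe über alles"))
```

Because the pattern is built on `\w`, the same function works on non-English text without changes, although languages written without spaces between words need a different approach.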

Simple topic identification: This introduces topic identification, which you can apply to any text you encounter. Using basic NLP models, topics can be identified from texts based on term frequencies.
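
Term-frequency topic identification can be sketched in a few lines of Python. The stop-word list, the `top_terms` helper, and the sample text below are all invented for illustration; a real stop-word list would be much longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "on", "for", "with"}

def top_terms(text, n=3):
    """Return the n most frequent non-stop-word tokens with their counts."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample text.
sample = (
    "The striker scored a goal and the keeper saved a penalty. "
    "The goal of the match came in the final minute of the match."
)
print(top_terms(sample, n=2))
```

The most frequent remaining words ("goal" and "match" here) give a crude hint about the topic of the text. Dropping stop words first matters, because otherwise words like "the" would dominate the counts.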

Named-entity recognition: Through this, you'll learn how to identify the who, what, and where of your texts using pre-trained models on English and non-English text. You'll also learn how to use some new libraries, polyglot and spaCy.

Building a fake news classifier: Here you will build a fake news detector. You'll learn the basics of supervised machine learning, then choose a few important features and test ideas for identifying and classifying fake news articles.
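
To make the idea of supervised learning concrete, the sketch below implements a small multinomial Naive Bayes text classifier in plain Python. Naive Bayes is a choice made for this illustration and is not named in the text above; the `NaiveBayes` class and the four training headlines are invented, and four examples are far too few for a real detector.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\w+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy headlines, for illustration only.
train_texts = [
    "shocking miracle cure doctors hate",
    "you will not believe this shocking secret",
    "city council approves new budget",
    "central bank holds interest rates steady",
]
train_labels = ["fake", "fake", "real", "real"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("shocking secret cure revealed"))
print(model.predict("council votes on interest rates"))
```

The classifier is never given explicit rules about what makes a headline fake; it estimates word probabilities per label from the labelled examples and picks the label with the higher score.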

NLP is needed when you want an intelligent system, such as a robot, to act on your instructions, or when you want to hear a decision from a system in natural language. NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be:

• Speech

• Written Text

There are two components of NLP.

• Natural Language Understanding (NLU)

• Natural Language Generation (NLG)

Natural Language Understanding (NLU):

NLU mainly involves the following tasks:

  • Mapping the given input in natural language into useful representations.
  • Analysing different aspects of the language.

Natural Language Generation (NLG):

  • It is the process of producing meaningful phrases and sentences.
  • It involves text planning, sentence planning, and text realisation.
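
The three NLG stages can be sketched as a tiny template-based pipeline in Python. The function names, the fact keys, and the sample facts are hypothetical and chosen only for this example; production NLG systems are considerably more elaborate.

```python
def plan_text(facts):
    """Text planning: choose which facts to mention and in what order."""
    order = ["sender", "message_count", "mood"]
    return [(key, facts[key]) for key in order if key in facts]

def plan_sentences(plan):
    """Sentence planning: group the selected facts into sentence-sized units."""
    content = dict(plan)
    sentences = []
    if "sender" in content and "message_count" in content:
        sentences.append(("count", content["sender"], content["message_count"]))
    if "mood" in content:
        sentences.append(("mood", content["mood"]))
    return sentences

def realise(sentences):
    """Text realisation: turn each unit into a grammatical surface string."""
    out = []
    for kind, *args in sentences:
        if kind == "count":
            sender, n = args
            noun = "message" if n == 1 else "messages"  # number agreement
            out.append(f"{sender} sent {n} {noun}.")
        elif kind == "mood":
            out.append(f"The overall tone was {args[0]}.")
    return " ".join(out)

# Invented facts about a chat, for illustration only.
facts = {"mood": "positive", "sender": "Asha", "message_count": 42}
print(realise(plan_sentences(plan_text(facts))))
```

Each stage has one job: the first decides what to say, the second decides how to split it into sentences, and the third handles wording details such as singular versus plural.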

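NLU is harder than NLG. Natural language is difficult to understand because of its rich form and structure and because it is ambiguous. Ambiguity occurs at several levels: lexical ambiguity (a word has more than one meaning, such as "bank"), syntax-level ambiguity (a sentence can be parsed in more than one way, as in "I saw the man with the telescope"), and referential ambiguity (it is unclear what a pronoun refers to).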

NLP Terminology

Phonology − The study of how sounds are organised systematically.

Morphology − The study of how words are constructed from primitive meaningful units.

Morpheme − It is a primitive unit of meaning in a language.

Syntax − It refers to arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.

Semantics − It is concerned with the meaning of words and how to combine words into meaningful phrases and sentences.

Pragmatics − It deals with using and understanding sentences in different situations and how the interpretation of the sentence is affected.

Discourse − It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.

World Knowledge − It includes general knowledge about the world.