Real-Time Twitter Sentiment Analysis Using Spark and the Twitter Streaming API, with Results Written to Elasticsearch
In this tutorial, I will show how to use Apache Spark and the Twitter Streaming API to read the Twitter feed in real time and apply sentiment analysis on the fly. We will then save the predictions in Elasticsearch so that we can build visualizations in Kibana.
Prerequisites:
- Understanding of Apache Spark
- Some understanding of natural language processing
This blog is:
- Not about improving the accuracy of prediction
- Not about teaching Spark
Let's start our tutorial. I will write the Apache Spark code in Scala. We will use some jar dependencies that you have to download from Maven.
Dependencies:
- twitter4j core
- spark streaming
- spark core
- spark elasticsearch
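If you would rather let sbt fetch the jars than download them by hand, the dependency list above might be declared roughly as follows. This is only a sketch: every version number and the Scala version are assumptions for illustration, and `spark-sql`, `spark-mllib`, and the `spark-streaming-twitter` connector (from Apache Bahir) are additions that the list above does not name. Match everything to the Spark and Elasticsearch versions you actually run.

```scala
// build.sbt (sketch; all version numbers below are assumptions, not tested pairings)
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"              % "2.4.8",
  "org.apache.spark"  %% "spark-streaming"         % "2.4.8",
  "org.apache.spark"  %% "spark-sql"               % "2.4.8", // assumed: needed for DataFrames
  "org.apache.spark"  %% "spark-mllib"             % "2.4.8", // assumed: needed to load a saved model
  "org.apache.bahir"  %% "spark-streaming-twitter" % "2.4.0", // assumed Twitter receiver
  "org.twitter4j"     %  "twitter4j-core"          % "4.0.6",
  "org.elasticsearch" %% "elasticsearch-spark-20"  % "7.17.0"
)
```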
Step 1: Create a Twitter Account
- Go to the My Apps tab and create a new Twitter app, or use an existing app.
- Click on the app and go to the "Keys and Access Tokens" tab. We will come back here later to get the keys and tokens.
Step 2: Create a Scala Project in Eclipse
I am going to use Eclipse as the IDE for Scala; you can use any IDE you want.
- Create a Scala project in Eclipse. If you don't know how to create a Scala project in Eclipse, follow this link.
- Now create a Scala object; you can name it whatever you want.
Let's start writing the Spark program to fetch Twitter streaming data.
Import all the libraries and packages necessary to perform this task.
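A minimal sketch of the imports and the stream setup is shown here, assuming the `TwitterUtils` receiver from the `spark-streaming-twitter` connector (Apache Bahir) is on the classpath. The object name, the application name, the 10-second batch interval, and the credential key names are all hypothetical; the four credential values come from the "Keys and Access Tokens" tab of your Twitter app.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.twitter.TwitterUtils
import twitter4j.Status

object TwitterStreamSetup {

  // twitter4j picks up OAuth credentials from these four system properties.
  // The map keys are placeholder names chosen for this sketch; a missing key
  // throws NoSuchElementException.
  def configureTwitterCredentials(creds: Map[String, String]): Unit = {
    System.setProperty("twitter4j.oauth.consumerKey", creds("TWITTER_CONSUMER_KEY"))
    System.setProperty("twitter4j.oauth.consumerSecret", creds("TWITTER_CONSUMER_SECRET"))
    System.setProperty("twitter4j.oauth.accessToken", creds("TWITTER_ACCESS_TOKEN"))
    System.setProperty("twitter4j.oauth.accessTokenSecret", creds("TWITTER_ACCESS_TOKEN_SECRET"))
  }

  // Build a local streaming context. The app name, the local master, and the
  // 10-second batch interval are assumptions for illustration.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("TwitterSentiment").setMaster("local[*]")
    new StreamingContext(conf, Seconds(10))
  }

  // Passing None makes the receiver authenticate with the system properties set above.
  def createTweetStream(ssc: StreamingContext): ReceiverInputDStream[Status] =
    TwitterUtils.createStream(ssc, None)
}
```

Calling `TwitterStreamSetup.configureTwitterCredentials(sys.env)` before `createTweetStream` would read the keys from environment variables, which keeps them out of the source code.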
I am interested in English tweets, so let's filter for them. Now that we have the Twitter streaming data, we need to extract the parts that matter to us. You can extract whatever you want from the Twitter data; here we will extract a number of features.
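The filtering and extraction step could be sketched as follows. The choice of fields, the column names, and the object name are illustrative assumptions rather than the original feature set; the getters themselves are standard `twitter4j.Status` methods.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructField, StructType}
import org.apache.spark.streaming.dstream.DStream
import twitter4j.Status

object TweetExtraction {

  // Keep only tweets whose language tag is English.
  def englishOnly(tweets: DStream[Status]): DStream[Status] =
    tweets.filter(status => status.getLang == "en")

  // Pull a handful of fields out of each tweet. The selection is illustrative;
  // add or drop fields to suit your own analysis.
  def toRow(status: Status): Row = Row(
    status.getId,
    status.getUser.getScreenName,
    status.getText,
    status.getCreatedAt.toString,
    status.getRetweetCount,
    status.getUser.getFollowersCount
  )

  // Declare the type of each field, in the same order as toRow,
  // so the rows can later be turned into a DataFrame.
  val tweetSchema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("user", StringType, nullable = true),
    StructField("text", StringType, nullable = true),
    StructField("created_at", StringType, nullable = true),
    StructField("retweet_count", IntegerType, nullable = false),
    StructField("followers_count", IntegerType, nullable = false)
  ))

  // English tweets only, converted to rows that match tweetSchema.
  def toRows(tweets: DStream[Status]): DStream[Row] =
    englishOnly(tweets).map(toRow)
}
```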
The code above creates rows holding all the data we need, and then declares the types of those fields so we can create a DataFrame. In this blog, we will not learn how to build the sentiment models; here I am just loading models that I have already created.
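Loading a previously trained model might look like the sketch below. It assumes the model was saved as a single Spark ML `PipelineModel`, which is a guess: this post does not say how the models were trained or stored, so swap in the matching loader if yours differ. The path and the object name are hypothetical.

```scala
import org.apache.spark.ml.PipelineModel

object SentimentModelLoader {

  // Hypothetical location; replace with wherever your trained model was saved.
  val DefaultModelPath: String = "models/sentiment-pipeline"

  // Reads a saved Spark ML pipeline from disk. A Spark session must be
  // available (or creatable) in the driver when this is called.
  def load(path: String = DefaultModelPath): PipelineModel =
    PipelineModel.load(path)
}
```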
In this code, we process the tweets from Twitter, convert them into the input format the sentiment analysis model expects, and then predict the sentiment score. We create a new DataFrame with all of this information and write it to Elasticsearch.
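The processing described above could be sketched as follows, under several assumptions: the model is a Spark ML `PipelineModel` that adds a column named `prediction`, Elasticsearch is reachable at `localhost:9200`, and the index name is supplied by the caller. The object and method names are hypothetical. Depending on your Elasticsearch version, the save target may need to be written as `index/type` rather than just an index name.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType
import org.apache.spark.streaming.dstream.DStream

object ScoreAndIndex {

  // Run the model and keep the original columns plus "prediction", dropping
  // intermediate feature columns that the pipeline may have added.
  def score(model: PipelineModel, tweets: DataFrame): DataFrame = {
    val keep = tweets.columns :+ "prediction"
    model.transform(tweets).select(keep.map(col): _*)
  }

  // Append the scored tweets to an Elasticsearch index via the
  // elasticsearch-spark connector. Host and port are assumptions.
  def writeToElasticsearch(scored: DataFrame, index: String): Unit =
    scored.write
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "localhost")
      .option("es.port", "9200")
      .mode("append")
      .save(index)

  // For every micro-batch: build a DataFrame from the rows, score it, and index it.
  def process(rows: DStream[Row], schema: StructType,
              model: PipelineModel, index: String): Unit =
    rows.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
        val tweets = spark.createDataFrame(rdd, schema)
        writeToElasticsearch(score(model, tweets), index)
      }
    }
}
```

Keeping `score` separate from the write step makes it possible to check the predictions on a small static DataFrame before pointing the job at a live Elasticsearch cluster.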
Let's start the streaming.
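Starting the job comes down to two standard `StreamingContext` calls, sketched here with a hypothetical wrapper object. At least one output operation (such as a `foreachRDD`) must already be registered on the stream, otherwise `start()` fails.

```scala
import org.apache.spark.streaming.StreamingContext

object StreamingLauncher {

  // Start receiving tweets, then block the driver until the job is stopped
  // or fails. Nothing is processed before start() is called.
  def run(ssc: StreamingContext): Unit = {
    ssc.start()
    ssc.awaitTermination()
  }
}
```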