Use the same data (obtained in the source code with Data=pd.read_csv('https://raw.githubusercontent.com/dD2405/Twitter_Sentiment_Analysis/master/train.csv')) and perform the sentiment analysis task on it using one of the scikit-learn classifiers for text.
ICP Requirements:
1) Data cleaning and preprocessing (at minimum: removing unnecessary columns or data, removing Twitter handles (@user), removing punctuation, numbers, and special characters, removing stop words, tokenization, stemming, TF-IDF vectors, POS tagging, checking for missing values, and a train/test split of the data).
2) Data visualization and analysis for critical steps (WordCloud, bar plots, etc.).
3) Model building and successfully executing the model to make predictions.
4) Code quality, WikiReport quality, and video explanation.
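The text-cleaning steps in requirement 1 could be sketched roughly as follows. This is only an illustration using the Python standard library: the stop-word set, the `simple_stem` helper, and the sample tweet are all invented for the example, and a real submission would more likely use a full stop-word list and a proper stemmer (for instance from NLTK).

```python
import re

# Tiny illustrative stop-word set (an assumption, not a complete list).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "to", "of", "in", "for"}


def simple_stem(token):
    """Crude suffix stripper standing in for a real stemmer (hypothetical helper)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_tweet(text):
    """Return a list of cleaned, stemmed tokens for one tweet."""
    text = re.sub(r"@\w+", " ", text)          # remove Twitter handles such as @user
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # remove punctuation, numbers, special characters
    tokens = text.lower().split()              # simple whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # remove stop words
    return [simple_stem(t) for t in tokens]    # stemming


# Invented sample tweet, not taken from the dataset.
sample = "@user Loving the 2 new cars!!! #happy"
print(clean_tweet(sample))
```

The cleaned tokens can be joined back into a string per tweet before building TF-IDF vectors.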
Technology to use: Spark
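Since the task statement names a scikit-learn classifier, the TF-IDF, train/test split, and model-building steps might look like the minimal sketch below. The toy texts and labels are invented so the example runs without downloading anything; in the actual task they would come from the loaded `Data` frame. The choice of `LogisticRegression` is an assumption (any scikit-learn text classifier would fit the brief), and running the same steps on Spark is not covered here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Invented toy corpus standing in for the cleaned tweets (1 = negative, 0 = other).
texts = [
    "love this sunny day",
    "great time with friends",
    "happy and thankful today",
    "what a wonderful morning",
    "hate this awful attitude",
    "terrible rude people everywhere",
    "so much hate and anger",
    "awful disgusting behaviour",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Hold out a quarter of the data for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# TF-IDF vectorization followed by a linear classifier, fitted on the training split only.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("predictions:", predictions)
print("test accuracy:", model.score(X_test, y_test))
```

Wrapping the vectorizer and classifier in one pipeline keeps the TF-IDF vocabulary from being fitted on the test split.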