Big Data in Action: a customer sentiment prediction initiative using PySpark and Databricks, tackling over 49,000 e-commerce orders and reviews.

gaurav1nemani/Customer-Sentiment-Prediction-PySpark

Customer Sentiment Prediction: PySpark

The goal was to understand what drives positive and negative customer reviews. We engineered 261 features from product, order, payment, and shipping data, then combined Random Forest and Gradient Boosting models in an ensemble that predicted customer sentiment with an accuracy above 0.86.
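The README does not say how the two models were combined, so the following is only a minimal sketch of one common option, soft voting: average each model's positive-class probability and threshold the mean. The function names, the 0.5 threshold, and the toy scores are all assumptions for illustration, not the project's code or data. In PySpark, such scores would typically come from the `probability` output column of `RandomForestClassifier` and `GBTClassifier`.

```python
from typing import List, Sequence


def soft_vote(rf_probs: Sequence[float], gbt_probs: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Average two models' positive-class probabilities and threshold the mean.

    Hypothetical helper: the 0.5 threshold and equal weighting are assumptions.
    """
    if len(rf_probs) != len(gbt_probs):
        raise ValueError("both models must score the same rows")
    return [1 if (p_rf + p_gbt) / 2.0 >= threshold else 0
            for p_rf, p_gbt in zip(rf_probs, gbt_probs)]


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and aligned")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up scores for four reviews (1 = positive, 0 = negative); not project data.
rf_probs = [0.90, 0.40, 0.20, 0.70]
gbt_probs = [0.80, 0.70, 0.10, 0.20]
labels = [1, 1, 0, 1]

ensemble_preds = soft_vote(rf_probs, gbt_probs)
ensemble_accuracy = accuracy(ensemble_preds, labels)
```

Averaging probabilities rather than hard votes lets a confident model outweigh an uncertain one, which is one reason soft voting is often tried first when combining tree ensembles.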

Beyond the model, the project delivered actionable insights:

Product description and shipping transparency emerged as key influencers of positive reviews.

Higher product weights and higher shipping costs were linked to negative experiences.

Payment methods and order values revealed behavioral patterns that informed retention strategies.