Project [CASE WHEN ((rating#2 >= 0.0) AND (rating#2 < 1.0)) THEN Very Low WHEN ((rating#2 >= 1.0) AND (rating#2 <= 2.0)) THEN Low WHEN ((rating#2 >= 3.0) AND (rating#2 <= 4.0)) THEN Medium WHEN ((rating#2 > 4.0) AND (rating#2 <= 5.0)) THEN High END AS rating_category#15]
== Parsed Logical Plan ==
GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(rating_category#15 as string) AS rating_category#59, cast(rating_count#54L as string) AS rating_count#60]
      +- Aggregate [rating_category#15], [rating_category#15, count(1) AS rating_count#54L]
         +- Project [userId#0L, movieId#1L, rating#2, timestamp_str#3L, date#8, CASE WHEN ((rating#2 >= cast(0 as double)) AND (rating#2 < cast(1 as double))) THEN Very Low WHEN ((rating#2 >= cast(1 as double)) AND (rating#2 <= cast(2 as double))) THEN Low WHEN ((rating#2 >= cast(3 as double)) AND (rating#2 <= cast(4 as double))) THEN Medium WHEN ((rating#2 > cast(4 as double)) AND (rating#2 <= cast(5 as double))) THEN High END AS rating_category#15]
            +- Sort [date#8 ASC NULLS FIRST], true
               +- Project [userId#0L, movieId#1L, rating#2, timestamp_str#3L, from_unixtime(timestamp_str#3L, yyyy-MM-dd, Some(GMT)) AS date#8]
                  +- LogicalRDD [userId#0L, movieId#1L, rating#2, timestamp_str#3L], false
== Analyzed Logical Plan ==
rating_category: string, rating_count: string
GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(rating_category#15 as string) AS rating_category#59, cast(rating_count#54L as string) AS rating_count#60]
      +- Aggregate [rating_category#15], [rating_category#15, count(1) AS rating_count#54L]
         +- Project [userId#0L, movieId#1L, rating#2, timestamp_str#3L, date#8, CASE WHEN ((rating#2 >= cast(0 as double)) AND (rating#2 < cast(1 as double))) THEN Very Low WHEN ((rating#2 >= cast(1 as double)) AND (rating#2 <= cast(2 as double))) THEN Low WHEN ((rating#2 >= cast(3 as double)) AND (rating#2 <= cast(4 as double))) THEN Medium WHEN ((rating#2 > cast(4 as double)) AND (rating#2 <= cast(5 as double))) THEN High END AS rating_category#15]
            +- Sort [date#8 ASC NULLS FIRST], true
               +- Project [userId#0L, movieId#1L, rating#2, timestamp_str#3L, from_unixtime(timestamp_str#3L, yyyy-MM-dd, Some(GMT)) AS date#8]
                  +- LogicalRDD [userId#0L, movieId#1L, rating#2, timestamp_str#3L], false
== Optimized Logical Plan ==
GlobalLimit 21
+- LocalLimit 21
   +- Aggregate [rating_category#15], [rating_category#15, cast(count(1) as string) AS rating_count#60]
      +- Project [CASE WHEN ((rating#2 >= 0.0) AND (rating#2 < 1.0)) THEN Very Low WHEN ((rating#2 >= 1.0) AND (rating#2 <= 2.0)) THEN Low WHEN ((rating#2 >= 3.0) AND (rating#2 <= 4.0)) THEN Medium WHEN ((rating#2 > 4.0) AND (rating#2 <= 5.0)) THEN High END AS rating_category#15]
         +- LogicalRDD [userId#0L, movieId#1L, rating#2, timestamp_str#3L], false
== Physical Plan ==
CollectLimit 21
+- *(2) HashAggregate(keys=[rating_category#15], functions=[count(1)], output=[rating_category#15, rating_count#60])
   +- Exchange hashpartitioning(rating_category#15, 200), true, [id=#79]
      +- *(1) HashAggregate(keys=[rating_category#15], functions=[partial_count(1)], output=[rating_category#15, count#64L])
         +- *(1) Project [CASE WHEN ((rating#2 >= 0.0) AND (rating#2 < 1.0)) THEN Very Low WHEN ((rating#2 >= 1.0) AND (rating#2 <= 2.0)) THEN Low WHEN ((rating#2 >= 3.0) AND (rating#2 <= 4.0)) THEN Medium WHEN ((rating#2 > 4.0) AND (rating#2 <= 5.0)) THEN High END AS rating_category#15]
            +- *(1) Scan ExistingRDD[userId#0L,movieId#1L,rating#2,timestamp_str#3L]
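
The CASE WHEN expression in the Project node has no ELSE branch, and its ranges do not cover values strictly between 2.0 and 3.0. The following is a minimal plain-Python sketch of the same branch logic, useful for checking which bucket a given rating lands in. The function name `rating_category` and the sample values are illustrative assumptions; they do not come from the original job.

```python
from typing import Optional


def rating_category(rating: Optional[float]) -> Optional[str]:
    """Mirror the CASE WHEN branches shown in the plan (hypothetical helper)."""
    if rating is None:
        # In SQL, comparisons against NULL match no branch, so the result is NULL.
        return None
    if 0.0 <= rating < 1.0:
        return "Very Low"
    if 1.0 <= rating <= 2.0:
        return "Low"
    if 3.0 <= rating <= 4.0:
        return "Medium"
    if 4.0 < rating <= 5.0:
        return "High"
    # No ELSE branch in the plan: unmatched values evaluate to NULL.
    return None


# Illustrative sample ratings (assumed, not taken from the dataset).
samples = [0.5, 1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0]
categories = {r: rating_category(r) for r in samples}
```

Under this logic a rating such as 2.5, if present in the data, would fall into none of the branches, so the Aggregate would count it under a NULL `rating_category` group.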