digraph G {
subgraph cluster0 {
isCluster="true";
label="WholeStageCodegen (2)\n \nduration: 12 ms";
1 [labelType="html" label="<b>HashAggregate</b><br><br>time in aggregation build: 11 ms<br>number of output rows: 1"];
}
2 [labelType="html" label="<b>Exchange</b><br><br>shuffle records written: 22<br>shuffle write time total (min, med, max (stageId: taskId))<br>46 ms (0 ms, 0 ms, 19 ms (stage 2.0: task 23))<br>records read: 22<br>local bytes read: 590.0 B<br>fetch wait time: 0 ms<br>remote bytes read: 708.0 B<br>local blocks read: 10<br>remote blocks read: 12<br>data size total (min, med, max (stageId: taskId))<br>352.0 B (16.0 B, 16.0 B, 16.0 B (stage 2.0: task 24))<br>shuffle bytes written total (min, med, max (stageId: taskId))<br>1298.0 B (59.0 B, 59.0 B, 59.0 B (stage 2.0: task 24))"];
subgraph cluster3 {
isCluster="true";
label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n29.6 s (416 ms, 1.3 s, 2.0 s (stage 2.0: task 23))";
4 [labelType="html" label="<b>HashAggregate</b><br><br>time in aggregation build total (min, med, max (stageId: taskId))<br>29.6 s (415 ms, 1.3 s, 2.0 s (stage 2.0: task 23))<br>number of output rows: 22"];
}
5 [labelType="html" label="<b>Scan csv </b><br><br>number of files read: 1<br>metadata time: 0 ms<br>size of files read: 2.7 GiB<br>number of output rows: 6,905,288"];
2->1;
4->2;
5->4;
}
HashAggregate(keys=[], functions=[count(1)])
WholeStageCodegen (2)
Exchange SinglePartition, true, [id=#34]
HashAggregate(keys=[], functions=[partial_count(1)])
WholeStageCodegen (1)
FileScan csv [] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>
== Parsed Logical Plan ==
Aggregate [count(1) AS count#86L]
+- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22L,Dropoff Census Tract#23L,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid Location#38] csv
== Analyzed Logical Plan ==
count: bigint
Aggregate [count(1) AS count#86L]
+- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22L,Dropoff Census Tract#23L,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid Location#38] csv
== Optimized Logical Plan ==
Aggregate [count(1) AS count#86L]
+- Project
   +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22L,Dropoff Census Tract#23L,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid Location#38] csv
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count#86L])
+- Exchange SinglePartition, true, [id=#34]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#89L])
      +- FileScan csv [] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>