ChicagoTaxiTripsAnalysis - Details for Query 4

Submitted Time: 2025/03/20 20:18:13
Duration: 46 s
Succeeded Jobs: 7

digraph G {
  subgraph cluster0 {
    isCluster="true";
    label="WholeStageCodegen (5)\n \nduration: 4 ms";
    1 [labelType="html" label="HashAggregate time in aggregation build: 4 ms number of output rows: 1"];
  }
  2 [labelType="html" label="Exchange shuffle records written: 200 shuffle write time total (min, med, max (stageId: taskId)) 52 ms (0 ms, 0 ms, 0 ms (stage 13.0: task 176)) records read: 200 local bytes read: 5.7 KiB fetch wait time: 0 ms remote bytes read: 5.4 KiB local blocks read: 103 remote blocks read: 97 data size total (min, med, max (stageId: taskId)) 3.1 KiB (16.0 B, 16.0 B, 16.0 B (stage 13.0: task 157)) shuffle bytes written total (min, med, max (stageId: taskId)) 11.1 KiB (56.0 B, 56.0 B, 59.0 B (stage 13.0: task 157))"];
  subgraph cluster3 {
    isCluster="true";
    label="WholeStageCodegen (4)\n \nduration: total (min, med, max (stageId: taskId))\n458 ms (1 ms, 1 ms, 10 ms (stage 13.0: task 159))";
    4 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 426 ms (0 ms, 1 ms, 9 ms (stage 13.0: task 159)) number of output rows: 200"];
    5 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 198 ms (0 ms, 0 ms, 8 ms (stage 13.0: task 159)) peak memory total (min, med, max (stageId: taskId)) 4.0 GiB (256.0 KiB, 256.0 KiB, 64.3 MiB (stage 13.0: task 157)) number of output rows: 78 avg hash probe bucket list iters (min, med, max (stageId: taskId)): (1, 1, 1 (stage 13.0: task 157))"];
  }
  6 [labelType="html" label="Exchange shuffle records written: 3,432 shuffle write time total (min, med, max (stageId: taskId)) 277 ms (4 ms, 6 ms, 9 ms (stage 12.0: task 122)) records read: 3,432 local bytes read total (min, med, max (stageId: taskId)) 96.1 KiB (0.0 B, 0.0 B, 2.0 KiB (stage 13.0: task 170)) fetch wait time total (min, med, max (stageId: taskId)) 24 ms (0 ms, 0 ms, 3 ms (stage 13.0: task 159)) remote bytes read total (min, med, max (stageId: taskId)) 95.5 KiB (0.0 B, 0.0 B, 1896.0 B (stage 13.0: task 198)) local blocks read: 1,412 remote blocks read: 1,404 data size total (min, med, max (stageId: taskId)) 80.1 KiB (1864.0 B, 1864.0 B, 1864.0 B (stage 12.0: task 113)) shuffle bytes written total (min, med, max (stageId: taskId)) 191.6 KiB (4.4 KiB, 4.4 KiB, 4.4 KiB (stage 12.0: task 133))"];
  subgraph cluster7 {
    isCluster="true";
    label="WholeStageCodegen (3)\n \nduration: total (min, med, max (stageId: taskId))\n1.4 m (628 ms, 2.0 s, 2.5 s (stage 12.0: task 118))";
    8 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 1.4 m (612 ms, 1.9 s, 2.5 s (stage 12.0: task 118)) peak memory total (min, med, max (stageId: taskId)) 2.8 GiB (64.3 MiB, 64.3 MiB, 64.3 MiB (stage 12.0: task 113)) number of output rows: 3,432 avg hash probe bucket list iters (min, med, max (stageId: taskId)): (1, 1, 1 (stage 12.0: task 113))"];
  }
  9 [labelType="html" label=" Union "];
  subgraph cluster10 {
    isCluster="true";
    label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n43.3 s (0 ms, 725 ms, 2.5 s (stage 12.0: task 118))";
    11 [labelType="html" label=" Project "];
  }
  12 [labelType="html" label="Scan csv number of files read: 1 metadata time: 0 ms size of files read: 2.7 GiB number of output rows: 6,905,288"];
  subgraph cluster13 {
    isCluster="true";
    label="WholeStageCodegen (2)\n \nduration: total (min, med, max (stageId: taskId))\n42.6 s (0 ms, 614 ms, 2.4 s (stage 12.0: task 137))";
    14 [labelType="html" label=" Project "];
  }
  15 [labelType="html" label="Scan csv number of files read: 1 metadata time: 0 ms size of files read: 2.7 GiB number of output rows: 6,905,288"];
  2->1; 4->2; 5->4; 6->5; 8->6; 9->8; 11->9; 12->11; 14->9; 15->14;
}
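
Most metrics in the graph labels follow the layout "total (min, med, max (stageId: taskId))" followed by the values, for example "458 ms (1 ms, 1 ms, 10 ms (stage 13.0: task 159))". The following is a minimal sketch for pulling those values out of a label string, assuming that layout holds. The function name parse_summary_metric and the keys of the returned dict are hypothetical and not part of Spark.

```python
import re

# Matches a value group such as
#   "458 ms (1 ms, 1 ms, 10 ms (stage 13.0: task 159))"
# Each value is a number (digits, dots, thousands separators) plus a unit.
_VALUE = r"\d[\d.,]* \w+"
_SUMMARY = re.compile(
    rf"(?P<total>{_VALUE}) "
    rf"\((?P<min>{_VALUE}), (?P<med>{_VALUE}), (?P<max>{_VALUE}) "
    r"\(stage (?P<stage>[\d.]+): task (?P<task>\d+)\)\)"
)


def parse_summary_metric(text):
    """Return the first total/min/med/max group found in text, or None.

    All values are returned as strings with their units left untouched;
    "stage" and "task" identify the task that produced the max value.
    """
    match = _SUMMARY.search(text)
    return match.groupdict() if match else None
```

Single-value metrics such as "duration: 4 ms" carry no per-task breakdown, so the sketch returns None for them rather than guessing.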

Details

== Parsed Logical Plan ==
Aggregate [count(1) AS count#167L]
+- Deduplicate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90, Pickup Centroid Longitude AS Longitude#91, Pickup Centroid Latitude AS Latitude#92, Pickup Census Tract AS Census Tract#93]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98, Dropoff Centroid Longitude AS Longitude#99, Dropoff Centroid Latitude AS Latitude#100, Dropoff Census Tract AS Census Tract#101]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Analyzed Logical Plan ==
count: bigint
Aggregate [count(1) AS count#167L]
+- Deduplicate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90, Pickup Centroid Longitude AS Longitude#91, Pickup Centroid Latitude AS Latitude#92, Pickup Census Tract AS Census Tract#93]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98, Dropoff Centroid Longitude AS Longitude#99, Dropoff Centroid Latitude AS Latitude#100, Dropoff Census Tract AS Census Tract#101]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Optimized Logical Plan ==
Aggregate [count(1) AS count#167L]
+- Aggregate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Physical Plan ==
*(5) HashAggregate(keys=[], functions=[count(1)], output=[count#167L])
+- Exchange SinglePartition, true, [id=#149]
   +- *(4) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#173L])
      +- *(4) HashAggregate(keys=[id#90], functions=[], output=[])
         +- Exchange hashpartitioning(id#90, 200), true, [id=#144]
            +- *(3) HashAggregate(keys=[id#90], functions=[], output=[id#90])
               +- Union
                  :- *(1) Project [Pickup Community Area#24 AS id#90]
                  :  +- FileScan csv [Pickup Community Area#24] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Pickup Community Area:string>
                  +- *(2) Project [Dropoff Community Area#25 AS id#98]
                     +- FileScan csv [Dropoff Community Area#25] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Dropoff Community Area:string>
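
The physical plan deduplicates in two phases: a map-side HashAggregate on id before the hash-partitioned Exchange, a final HashAggregate on id after it, then a partial_count(1) per shuffle partition and a final count(1) on a single partition. The metrics match this shape: 3,432 records cross the first Exchange, 78 rows survive the final deduplication, and 200 partial counts feed the last aggregate. Since 3,432 = 44 × 78, the numbers would be consistent with 44 scan tasks that each saw all 78 ids, although the page does not state the task count, so treat that as an inference. The sketch below models the data flow in plain Python. It is a simplified illustration, not Spark's implementation, and the function name distinct_count_two_phase is hypothetical.

```python
from collections import defaultdict


def distinct_count_two_phase(partitions, num_shuffle_partitions=200):
    """Model the plan's distinct count; returns (count, shuffle_records)."""
    # Map side: HashAggregate(keys=[id]) inside each input partition.
    # Each partition emits every key it saw at most once.
    partial = [set(rows) for rows in partitions]
    shuffle_records = sum(len(keys) for keys in partial)

    # Exchange hashpartitioning(id, N): equal keys land in the same bucket.
    buckets = defaultdict(set)
    for keys in partial:
        for key in keys:
            # Final HashAggregate(keys=[id]): the set drops cross-partition repeats.
            buckets[hash(key) % num_shuffle_partitions].add(key)

    # partial_count(1): one row per shuffle partition, including empty ones.
    partial_counts = [len(buckets.get(p, ())) for p in range(num_shuffle_partitions)]

    # Exchange SinglePartition followed by the final count(1).
    return sum(partial_counts), shuffle_records
```

With three toy partitions such as [["8", "32", None, "8"], ["32", "76"], [None, "8"]], the map side emits 3 + 2 + 2 = 7 shuffle records and the final count is 4, which mirrors how the 3,432 shuffled records collapse to 78 distinct ids here.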