ChicagoTaxiTripsAnalysis - Details for Query 5

Submitted Time: 2025/03/20 19:47:34
Duration: 47 s
Succeeded Jobs: 8

digraph G {
  subgraph cluster0 {
    isCluster="true";
    label="WholeStageCodegen (5)\n \nduration: 4 ms";
    1 [labelType="html" label="HashAggregate time in aggregation build: 4 ms number of output rows: 1"];
  }
  2 [labelType="html" label="Exchange shuffle records written: 200 shuffle write time total (min, med, max (stageId: taskId)) 54 ms (0 ms, 0 ms, 2 ms (stage 15.0: task 360)) records read: 200 local bytes read: 5.6 KiB fetch wait time: 0 ms remote bytes read: 5.6 KiB local blocks read: 100 remote blocks read: 100 data size total (min, med, max (stageId: taskId)) 3.1 KiB (16.0 B, 16.0 B, 16.0 B (stage 15.0: task 180)) shuffle bytes written total (min, med, max (stageId: taskId)) 11.1 KiB (56.0 B, 56.0 B, 59.0 B (stage 15.0: task 180))"];
  subgraph cluster3 {
    isCluster="true";
    label="WholeStageCodegen (4)\n \nduration: total (min, med, max (stageId: taskId))\n434 ms (1 ms, 1 ms, 10 ms (stage 15.0: task 194))";
    4 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 380 ms (0 ms, 1 ms, 10 ms (stage 15.0: task 194)) number of output rows: 200"];
    5 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 169 ms (0 ms, 0 ms, 9 ms (stage 15.0: task 194)) peak memory total (min, med, max (stageId: taskId)) 4.0 GiB (256.0 KiB, 256.0 KiB, 64.3 MiB (stage 15.0: task 180)) number of output rows: 78 avg hash probe bucket list iters (min, med, max (stageId: taskId)): (1, 1, 1 (stage 15.0: task 180))"];
  }
  6 [labelType="html" label="Exchange shuffle records written: 3,432 shuffle write time total (min, med, max (stageId: taskId)) 271 ms (4 ms, 6 ms, 11 ms (stage 14.0: task 162)) records read: 3,432 local bytes read total (min, med, max (stageId: taskId)) 95.5 KiB (0.0 B, 0.0 B, 1896.0 B (stage 15.0: task 221)) fetch wait time total (min, med, max (stageId: taskId)) 17 ms (0 ms, 0 ms, 4 ms (stage 15.0: task 194)) remote bytes read total (min, med, max (stageId: taskId)) 96.1 KiB (0.0 B, 0.0 B, 2.0 KiB (stage 15.0: task 193)) local blocks read: 1,404 remote blocks read: 1,412 data size total (min, med, max (stageId: taskId)) 80.1 KiB (1864.0 B, 1864.0 B, 1864.0 B (stage 14.0: task 136)) shuffle bytes written total (min, med, max (stageId: taskId)) 191.6 KiB (4.4 KiB, 4.4 KiB, 4.4 KiB (stage 14.0: task 156))"];
  subgraph cluster7 {
    isCluster="true";
    label="WholeStageCodegen (3)\n \nduration: total (min, med, max (stageId: taskId))\n1.5 m (643 ms, 2.1 s, 2.5 s (stage 14.0: task 159))";
    8 [labelType="html" label="HashAggregate time in aggregation build total (min, med, max (stageId: taskId)) 1.5 m (635 ms, 2.1 s, 2.5 s (stage 14.0: task 159)) peak memory total (min, med, max (stageId: taskId)) 2.8 GiB (64.3 MiB, 64.3 MiB, 64.3 MiB (stage 14.0: task 136)) number of output rows: 3,432 avg hash probe bucket list iters (min, med, max (stageId: taskId)): (1, 1, 1 (stage 14.0: task 136))"];
  }
  9 [labelType="html" label=" Union "];
  subgraph cluster10 {
    isCluster="true";
    label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n44.3 s (0 ms, 662 ms, 2.4 s (stage 14.0: task 137))";
    11 [labelType="html" label=" Project "];
  }
  12 [labelType="html" label="Scan csv number of files read: 1 metadata time: 0 ms size of files read: 2.7 GiB number of output rows: 6,905,288"];
  subgraph cluster13 {
    isCluster="true";
    label="WholeStageCodegen (2)\n \nduration: total (min, med, max (stageId: taskId))\n43.8 s (0 ms, 638 ms, 2.5 s (stage 14.0: task 159))";
    14 [labelType="html" label=" Project "];
  }
  15 [labelType="html" label="Scan csv number of files read: 1 metadata time: 0 ms size of files read: 2.7 GiB number of output rows: 6,905,288"];
  2->1;
  4->2;
  5->4;
  6->5;
  8->6;
  9->8;
  11->9;
  12->11;
  14->9;
  15->14;
}

HashAggregate(keys=[], functions=[count(1)])

WholeStageCodegen (5)

Exchange SinglePartition, true, [id=#168]

HashAggregate(keys=[], functions=[partial_count(1)])

HashAggregate(keys=[id#90], functions=[])

WholeStageCodegen (4)

Exchange hashpartitioning(id#90, 200), true, [id=#163]

HashAggregate(keys=[id#90], functions=[])

WholeStageCodegen (3)

Union

Project [Pickup Community Area#24 AS id#90]

WholeStageCodegen (1)

FileScan csv [Pickup Community Area#24] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Pickup Community Area:string>

Project [Dropoff Community Area#25 AS id#98]

WholeStageCodegen (2)

FileScan csv [Dropoff Community Area#25] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Dropoff Community Area:string>

Details

== Parsed Logical Plan ==
Aggregate [count(1) AS count#176L]
+- Deduplicate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90, Pickup Centroid Longitude AS Longitude#91, Pickup Centroid Latitude AS Latitude#92, Pickup Census Tract AS Census Tract#93]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98, Dropoff Centroid Longitude AS Longitude#99, Dropoff Centroid Latitude AS Latitude#100, Dropoff Census Tract AS Census Tract#101]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Analyzed Logical Plan ==
count: bigint
Aggregate [count(1) AS count#176L]
+- Deduplicate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90, Pickup Centroid Longitude AS Longitude#91, Pickup Centroid Latitude AS Latitude#92, Pickup Census Tract AS Census Tract#93]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98, Dropoff Centroid Longitude AS Longitude#99, Dropoff Centroid Latitude AS Latitude#100, Dropoff Census Tract AS Census Tract#101]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Optimized Logical Plan ==
Aggregate [count(1) AS count#176L]
+- Aggregate [id#90]
   +- Union
      :- Project [Pickup Community Area#24 AS id#90]
      :  +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv
      +- Project [Dropoff Community Area#25 AS id#98]
         +- Relation[Trip ID#16,Taxi ID#17,Trip Start Timestamp#18,Trip End Timestamp#19,Trip Seconds#20,Trip Miles#21,Pickup Census Tract#22,Dropoff Census Tract#23,Pickup Community Area#24,Dropoff Community Area#25,Fare#26,Tips#27,Tolls#28,Extras#29,Trip Total#30,Payment Type#31,Company#32,Pickup Centroid Latitude#33,Pickup Centroid Longitude#34,Pickup Centroid Location#35,Dropoff Centroid Latitude#36,Dropoff Centroid Longitude#37,Dropoff Centroid  Location#38] csv

== Physical Plan ==
*(5) HashAggregate(keys=[], functions=[count(1)], output=[count#176L])
+- Exchange SinglePartition, true, [id=#168]
   +- *(4) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#182L])
      +- *(4) HashAggregate(keys=[id#90], functions=[], output=[])
         +- Exchange hashpartitioning(id#90, 200), true, [id=#163]
            +- *(3) HashAggregate(keys=[id#90], functions=[], output=[id#90])
               +- Union
                  :- *(1) Project [Pickup Community Area#24 AS id#90]
                  :  +- FileScan csv [Pickup Community Area#24] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Pickup Community Area:string>
                  +- *(2) Project [Dropoff Community Area#25 AS id#98]
                     +- FileScan csv [Dropoff Community Area#25] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[s3a://data-repository-bkt/ECS765/Chicago_Taxitrips/chicago_taxi_trips.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Dropoff Community Area:string>
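
The physical plan above counts distinct ids in two phases: a per-partition HashAggregate on id, a hash-partitioned Exchange, a final HashAggregate on id, then a partial count per shuffle partition that is summed after the SinglePartition Exchange. The following is a minimal plain-Python sketch of that shape, not Spark code. The function name distinct_count, the round-robin input split, and the use of Python's built-in hash() are assumptions made for illustration; only the default of 200 shuffle partitions is taken from the plan.

```python
def distinct_count(pickups, dropoffs, num_input_partitions=4,
                   num_shuffle_partitions=200):
    """Emulate the plan: Union -> partial dedup -> shuffle -> final dedup -> count."""
    # Union: concatenate the two projected id columns.
    rows = list(pickups) + list(dropoffs)

    # Stand-in for the input splits of the CSV scan (round-robin is an assumption).
    input_partitions = [rows[i::num_input_partitions]
                        for i in range(num_input_partitions)]

    # First HashAggregate(keys=[id]): deduplicate within each input partition.
    partial_keys = [set(partition) for partition in input_partitions]

    # Exchange hashpartitioning(id, N): route every key to one shuffle partition,
    # so equal keys from different input partitions meet in the same place.
    shuffled = [[] for _ in range(num_shuffle_partitions)]
    for keys in partial_keys:
        for key in keys:
            shuffled[hash(key) % num_shuffle_partitions].append(key)

    # Second HashAggregate(keys=[id]) plus partial_count(1): one count per
    # shuffle partition. None (null) forms a single group, as in a GROUP BY.
    partial_counts = [len(set(keys)) for keys in shuffled]

    # Exchange SinglePartition plus final count(1): sum the partial counts.
    return sum(partial_counts)
```

For example, distinct_count(["8", "32", None, "8"], ["32", "76", None]) returns 4: the ids "8", "32" and "76", plus one group for the missing value.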