pyspark.sql.DataFrame.crossJoin
DataFrame.crossJoin(other: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Returns the Cartesian product with another DataFrame.

New in version 2.1.0.
Changed in version 3.4.0: Supports Spark Connect.
Examples
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df2 = spark.createDataFrame(
...     [Row(height=80, name="Tom"), Row(height=85, name="Bob")])
>>> df.crossJoin(df2.select("height")).select("age", "name", "height").show()
+---+-----+------+
|age| name|height|
+---+-----+------+
| 14|  Tom|    80|
| 14|  Tom|    85|
| 23|Alice|    80|
| 23|Alice|    85|
| 16|  Bob|    80|
| 16|  Bob|    85|
+---+-----+------+
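
The semantics of a Cartesian product can be modeled in plain Python, without a Spark session. The sketch below is only an illustration of what the result contains, not how Spark executes the join. The names `left`, `right`, and `rows` are hypothetical, and the data mirrors the example above. Every row on the left is paired with every row on the right, so the result has `len(left) * len(right)` rows.

```python
from itertools import product

# Same data as the example above, held as plain tuples instead of DataFrames.
left = [(14, "Tom"), (23, "Alice"), (16, "Bob")]  # columns: (age, name)
right = [(80,), (85,)]                            # columns: (height,)

# Pair every left row with every right row and concatenate their columns.
rows = [left_row + right_row for left_row, right_row in product(left, right)]

for row in rows:
    print(row)

# The row count of a Cartesian product is the product of the input row counts.
print(len(rows) == len(left) * len(right))
```

Because the output grows multiplicatively with the input sizes, a cross join of two large DataFrames can become very expensive. Unlike this list-based sketch, Spark does not guarantee any particular row order in the result unless one is requested explicitly.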