pyspark.sql.DataFrameWriter.saveAsTable

DataFrameWriter.saveAsTable(name: str, format: Optional[str] = None, mode: Optional[str] = None, partitionBy: Union[str, List[str], None] = None, **options: OptionalPrimitiveType) → None

Saves the content of the DataFrame as the specified table.

If the table already exists, the behavior of this function depends on the save mode, set through the mode argument or DataFrameWriter.mode() (the default is to throw an exception). When mode is overwrite, the schema of the DataFrame does not need to match that of the existing table.

append: Append contents of this DataFrame to existing data.
overwrite: Overwrite existing data.
error or errorifexists: Throw an exception if data already exists.
ignore: Silently ignore this operation if data already exists.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters

name : str
    the table name
format : str, optional
    the format used to save
mode : str, optional
    one of append, overwrite, error, errorifexists, ignore (default: error)
partitionBy : str or list
    names of partitioning columns
**options : dict
    all other string options

Notes

When mode is Append, if there is an existing table, we will use the format and options of the existing table. The column order in the schema of the DataFrame doesn’t need to be the same as that of the existing table. Unlike DataFrameWriter.insertInto(), DataFrameWriter.saveAsTable() will use the column names to find the correct column positions.

Examples

Creates a table from a DataFrame and reads it back.

>>> _ = spark.sql("DROP TABLE IF EXISTS tblA")
>>> spark.createDataFrame([
...     (100, "Hyukjin Kwon"), (120, "Hyukjin Kwon"), (140, "Haejoon Lee")],
...     schema=["age", "name"]
... ).write.saveAsTable("tblA")
>>> spark.read.table("tblA").sort("age").show()
+---+------------+
|age|        name|
+---+------------+
|100|Hyukjin Kwon|
|120|Hyukjin Kwon|
|140| Haejoon Lee|
+---+------------+
>>> _ = spark.sql("DROP TABLE tblA")
