pyspark.sql.functions.try_sum

pyspark.sql.functions.try_sum(col)
Returns the sum of the values in a group; the result is NULL on overflow.
New in version 3.5.0.
Parameters
    col : Column or str
Examples
Example 1: Calculating the sum of values in a column
>>> from pyspark.sql import functions as sf
>>> df = spark.range(10)
>>> df.select(sf.try_sum(df["id"])).show()
+-----------+
|try_sum(id)|
+-----------+
|         45|
+-----------+
Example 2: Using a plus expression to calculate the sum
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, 2), (3, 4)], ["A", "B"])
>>> df.select(sf.try_sum(sf.col("A") + sf.col("B"))).show()
+----------------+
|try_sum((A + B))|
+----------------+
|              10|
+----------------+
Example 3: Calculating the sum of ages with None values
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1982, None), (1990, 2), (2000, 4)], ["birth", "age"])
>>> df.select(sf.try_sum("age")).show()
+------------+
|try_sum(age)|
+------------+
|           6|
+------------+
Example 4: Overflow results in NULL when ANSI mode is on
>>> from decimal import Decimal
>>> import pyspark.sql.functions as sf
>>> origin = spark.conf.get("spark.sql.ansi.enabled")
>>> spark.conf.set("spark.sql.ansi.enabled", "true")
>>> try:
...     df = spark.createDataFrame([(Decimal("1" * 38),)] * 10, "number DECIMAL(38, 0)")
...     df.select(sf.try_sum(df.number)).show()
... finally:
...     spark.conf.set("spark.sql.ansi.enabled", origin)
+---------------+
|try_sum(number)|
+---------------+
|           NULL|
+---------------+