  1. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · The selected correct answer does not address the question, and the other answers are all wrong for pyspark. There is no "!=" operator equivalent in pyspark for this …

  2. pyspark - How to use AND or OR condition in when in Spark

pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on …

  3. cannot resolve column due to data type mismatch PySpark

    Mar 12, 2020 · cannot resolve column due to data type mismatch PySpark Asked 5 years, 9 months ago Modified 4 years, 9 months ago Viewed 39k times

  4. Show distinct column values in pyspark dataframe - Stack Overflow

    With pyspark dataframe, how do you do the equivalent of Pandas df['col'].unique(). I want to list out all the unique values in a pyspark dataframe column. Not the SQL type way …

  5. PySpark: multiple conditions in when clause - Stack Overflow

Jun 8, 2016 · Very helpful observation: in pyspark, multiple conditions can be built using & (for and) and | (for or). Note: In pyspark it is important to enclose every expression within …

  6. python - PySpark: "Exception: Java gateway process exited before ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = …

  7. How to find count of Null and Nan values for each column in a …

    Jun 19, 2017 · How to find count of Null and Nan values for each column in a PySpark dataframe efficiently? Asked 8 years, 6 months ago Modified 2 years, 8 months ago Viewed 291k times

  8. Pyspark dataframe LIKE operator - Stack Overflow

    Oct 24, 2016 · What is the equivalent in Pyspark for LIKE operator? For example I would like to do: SELECT * FROM table WHERE column LIKE "*somestring*"; looking for something easy …

  9. pyspark - Adding a dataframe to an existing delta table throws …

Jun 9, 2024 · Fix: the issue was due to mismatched data types. Explicitly declaring the schema type resolved it. schema = StructType([ StructField("_id", StringType(), True), …

  10. How do I replace a string value with a NULL in PySpark?

    Mar 7, 2023 · I want to do something like this: df.replace('empty-value', None, 'NAME') Basically, I want to replace some value with NULL, but it does not accept None as an argument. How can …