• if you’re programming in Scala, this particular page in the docs regarding the Dataset API will be your friend
  • DataFrames are untyped (rows whose columns are checked against the schema at runtime); Datasets are statically typed collections of JVM objects
  • “Dataset APIs are all expressed as lambda (anonymous) functions and JVM typed objects”
    • a Dataset is a Dataset[T], a collection of strongly typed JVM objects; a DataFrame is just an alias for Dataset[Row], where Row is a generic untyped JVM object
    • :star: dataframes vs datasets vs RDDs
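
To make the typed vs untyped distinction in the notes above concrete, here is a minimal sketch in Scala. It assumes Spark SQL is on the classpath and runs in local mode; the `Person` case class, the `DatasetVsDataFrame` object, and the `adultNames` helper are hypothetical names made up for this illustration, not part of the Spark API.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Long)

object DatasetVsDataFrame {

  // Runs the same query twice: once through the typed Dataset API,
  // once through the untyped DataFrame API. Returns both results.
  def adultNames(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._ // encoders, toDS(), and the $"col" syntax

    // Typed: a Dataset[Person] holds JVM objects of a known class.
    val people: Dataset[Person] =
      Seq(Person("Ada", 36L), Person("Linus", 29L)).toDS()

    // Lambdas over typed objects: a typo such as p.agee fails at compile time.
    val typed: Seq[String] =
      people.filter(p => p.age >= 30).map(_.name).collect().toSeq

    // Untyped: DataFrame is an alias for Dataset[Row].
    val df: DataFrame = people.toDF()

    // Columns are referenced by name: a typo such as $"agee" only fails
    // when Spark analyzes the query at runtime.
    val untyped: Seq[String] =
      df.filter($"age" >= 30).select("name").as[String].collect().toSeq

    (typed, untyped)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataset-vs-dataframe")
      .master("local[*]") // assumption: local mode, for experimenting
      .getOrCreate()

    val (typed, untyped) = adultNames(spark)
    println(s"typed: $typed, untyped: $untyped")

    spark.stop()
  }
}
```

Going the other way, `df.as[Person]` turns a DataFrame back into a typed Dataset as long as the column names and types line up with the case class.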