Data Engineering Boot Camp V2 Combined Track
Spark Batch Processing - Caching, UDFs, DataFrames, Datasets, SparkSQL, and Parquet (Day 2 Lecture)
Week 4: Batch Pipelines with Apache Spark V2
54 mins
SQL
Data Modeling
ETL/ELT
Apache Spark
Description
In this lecture, we dive deeper into Spark, focusing on optimization: caching, temporary views, UDFs, the choice between DataFrame, Dataset, and SparkSQL, Parquet, and tuning considerations.