Introduction to Databricks

Location: On-Site or Online
Pricing: $1,250 per seat (6-seat minimum)
Length: 3 Days

Course Summary

Introduction to Databricks is a practical, hands-on course that teaches students how to build analytics and data engineering workloads on the Databricks Lakehouse Platform.

Students learn how Databricks unifies data lakes and data warehouses into a single platform for ingestion, transformation, analytics, and collaboration. The course emphasizes real-world workflows using Apache Spark, notebooks, Delta Lake, and production-oriented best practices.

By the end of the course, students can work comfortably in Databricks workspaces, process data at scale, manage Delta tables, and support analytics pipelines in production environments.

Course Outline

Day 1 – Databricks and Lakehouse Fundamentals

💬 Lecture: Modern data platforms and the Lakehouse concept
💬 Lecture: Databricks architecture and core components
💬 Lecture: Workspaces, clusters, and compute models
💬 Lecture: Introduction to Apache Spark in Databricks
⚙️ Lab: Accessing a Databricks workspace
⚙️ Lab: Creating and managing clusters
⚙️ Lab: Exploring Databricks notebooks
⚙️ Lab: Running basic Spark jobs
⚙️ Lab: Understanding job execution and cluster behavior
💬 Lecture: Data sources, formats, and ingestion patterns
⚙️ Lab: Ingesting data from cloud storage
⚙️ Lab: Reading CSV and JSON data with Spark

Day 2 – Data Transformation and Analytics with Delta Lake

💬 Lecture: Structured vs semi-structured data
💬 Lecture: Delta Lake fundamentals
💬 Lecture: Spark SQL and DataFrame APIs
⚙️ Lab: Writing data to Delta tables
⚙️ Lab: Updating and deleting data with Delta Lake
⚙️ Lab: Inspecting table history and using time travel
💬 Lecture: Querying data for analytics
💬 Lecture: Databricks SQL and visualizations
⚙️ Lab: Running analytical queries on Delta tables
⚙️ Lab: Creating visualizations in notebooks
⚙️ Lab: Sharing notebooks with teammates
💬 Lecture: Jobs, workflows, and scheduling
⚙️ Lab: Creating a scheduled Databricks job
⚙️ Lab: Monitoring job execution and logs

Day 3 – Production Workflows, Governance, and Performance

💬 Lecture: Data governance and security concepts
💬 Lecture: Access control and permissions
💬 Lecture: Performance tuning basics
⚙️ Lab: Managing workspace permissions
⚙️ Lab: Securing tables and data access
⚙️ Lab: Optimizing queries and cluster usage
💬 Lecture: Cost management and operational best practices
💬 Lecture: Databricks in production environments
⚙️ Lab: Monitoring usage and cost drivers
⚙️ Lab: Designing a complete Lakehouse workflow
⚙️ Lab: Combining ingestion, transformation, and analytics
⚙️ Lab: Validating data accuracy and performance

Outcomes

Students who complete Introduction to Databricks will be able to:

Navigate and use Databricks workspaces confidently
Process data at scale using Apache Spark
Build and manage Delta Lake tables
Create analytics workflows and scheduled jobs
Apply governance, performance, and cost best practices
Support production Lakehouse environments effectively