Data Engineering with Databricks

Price
$1,500.00 USD

Duration
2 Days

 

Delivery Methods
Virtual Instructor Led
Private Group

Course Overview

Data professionals from all walks of life will benefit from this comprehensive introduction to the components of the Databricks Lakehouse Platform that directly support putting ETL pipelines into production. You will use SQL and Python to define and schedule pipelines that incrementally process new data from a variety of data sources to power analytic applications and dashboards in the Lakehouse. This course offers hands-on instruction in the Databricks Data Science & Engineering Workspace, Databricks SQL, Delta Live Tables, Databricks Repos, Databricks Task Orchestration, and Unity Catalog.

This course will prepare you to take the Databricks Certified Data Engineer Associate exam.

Course Objectives

  • Leverage the Databricks Lakehouse Platform to perform core responsibilities for data pipeline development
  • Use SQL and Python to write production data pipelines to extract, transform, and load data into tables and views in the Lakehouse
  • Simplify data ingestion and incremental change propagation using Databricks-native features and syntax, including Delta Live Tables
  • Orchestrate production pipelines to deliver fresh results for ad-hoc analytics and dashboarding
  • Top-rated instructors: Our crew of subject matter experts has an average instructor rating of 4.8 out of 5 across thousands of reviews.
  • Authorized content: We maintain more than 35 Authorized Training Partnerships with the top players in tech, ensuring your course materials contain the most relevant and up-to-date information.
  • Interactive classroom participation: Our virtual training includes live lectures, demonstrations and virtual labs that allow you to participate in discussions with your instructor and fellow classmates to get real-time feedback.
  • Post-class resources: Review your class content, catch up on any material you may have missed, or perfect your new skills with access to resources after your course is complete.
  • Private Group Training: Let our world-class instructors deliver exclusive training courses just for your employees. Our private group training is designed to promote your team’s shared growth and skill development.
  • Tailored Training Solutions: Our subject matter experts can customize the class to specifically address the unique goals of your team.

Course Prerequisites

  • Basic knowledge of SQL query syntax, including writing queries using SELECT, WHERE, GROUP BY, ORDER BY, LIMIT, and JOIN
  • Basic knowledge of SQL DDL statements to create, alter, and drop databases and tables
  • Basic knowledge of SQL DML statements, including DELETE, INSERT, UPDATE, and MERGE
  • Experience with or knowledge of data engineering practices on cloud platforms, including cloud features such as virtual machines, object storage, identity management, and metastores
  • Basic familiarity with Python variables, functions, and control flow (preferred)

Agenda

Day 1

  • Delta Lake
  • Relational entities on Databricks
  • ETL with Spark SQL
  • Incremental data processing with Structured Streaming and Auto Loader

Day 2

  • Medallion architecture in the data lakehouse
  • Delta Live Tables
  • Task orchestration with Databricks Jobs
  • Databricks SQL
  • Managing permissions in the lakehouse
  • Productionizing dashboards and queries on Databricks SQL
 

Get in touch to schedule training for your team
We can enroll multiple students in an upcoming class or schedule a dedicated private training event designed to meet your organization’s needs.

 



Do You Have Additional Questions? Please Contact Us Below.

Contact Us about Starting Your Business Training Strategy with New Horizons