Big Data Code Changes for Full Stack Simulation Engine 2026

Fix code for the Full Stack Simulation Engine: refresher course, April 2026

Created by Shivgan Joshi

Last updated 4/2026

English

What you'll learn

Big Data Fundamentals for Full Stack Simulations
File, code, and settings structure of a big data implementation
YAML, PY, CSV, and other files used, plus a refresher on Git and PyCharm
Jenkins builds, PySpark runs, and errors

Course content

14 sections • 50 lectures • 58m total length

Introduction to the Course (1:54)
Video 1: Intro for the Lecture (2:34)
Operate on large production systems in a manager-less big data environment using PySpark, Hadoop, Git, and Jenkins, with sampling, yaml-csv-python pipelines, and logistic regression for model evaluation.

Revision of 101 course (0:38)
Video (2:02)
Reading emails: sent, inbox, and from people involved (0:26)
Connect to remote systems (0:02)
Intro video for independent working (3:09)
Quiz
Tools to connect: AnyDesk and HopToDesk (1:25)
Explore a practical setup for remote connections by prioritizing AnyDesk, HopToDesk, and RustDesk, while using TeamViewer to manage IDs and login details; assess image quality, stability, and network considerations.
Video on email (0:54)
Search Outlook and personal email across inbox, folders, and network drives for tasks assigned to you. Check the last 24 hours or 7 days and note Jira or wiki references.
Video on remote systems: bridge, client VMware, edge, and HDFS (3:10)
Learn the five-layer remote access workflow from bridge to Horizon VMware to master node, connecting to HDFS, using AnyDesk, HopToDesk, TeamViewer, and RustDesk.
Video on Confluence wiki page (1:37)
Learn to navigate the Confluence wiki, filter tasks by date modified, and reference team notes and screenshots to complete wiki-based tasks for the big data full-stack simulation engine.
Video on how to search videos (1:19)
Explore three ways to locate videos: by individual video, by playlist, or via an Excel sheet, and how editors rename and link videos in private YouTube playlists for GCP workflows.

Independent working without any supervision: see wikipedia, videos, emails (0:13)
Reading wiki notes and updating notes (0:24)
Understanding the video library to proceed with the role (0:13)
How to use Copilot for errors or understanding code (0:10)
Learning tools is very important: PuTTY, WinSCP, others (0:03)
AnyDesk and HopToDesk: two important tools (1:25)
Compare remote access tools for a full stack simulation engine. Prioritize AnyDesk, HopToDesk, and RustDesk, and use TeamViewer to retrieve IDs and passwords for stable connections.
Understanding the tech stack for running a notebook on the master for HDFS data (2:44)
Sample video for reference: not exactly what we want but close enough (5:13)

VS Code and Copilot: venv and Git settings (0:16)
PyCharm: pytest, venv, Copilot (0:11)
Video on Copilot, VS Code, and PyCharm intro (1:57)
Learn to use Copilot in Visual Studio Code and PyCharm, compare community and paid editions, and understand edge node deployment with a bridge Ubuntu computer connecting Windows clients to Hadoop.
Video on PuTTY and WinSCP (1:44)
Connect a Windows client to the edge node using PuTTY for SSH and WinSCP for file transfer, with SSH keys and wiki-backed configuration notes.

Requirements

No experience needed

Description

Big Data Code Changes for the Full‑Stack Simulation Engine

Objectives and Key Tasks

Fix and refactor code for the Full‑Stack Simulation Engine to support different simulation runs.
Understand the end‑to‑end setup and architecture of the full‑stack simulation engine, including:
- Source code
- Configuration settings
- Data inputs
- PySpark pipelines
- Grid and execution environments
Run and debug PySpark code both locally and in the target (shared or distributed) environment.
Fix and update YAML configuration files and the associated Python files, and clearly understand all inputs and parameters used in the YAML files.
Configure PyCharm for effective local development and testing, especially local testing using small sample datasets.
Modify and extend YAML parameters as required by new logic, experiments, or use cases.
Fix variable definitions, naming issues, and scope inconsistencies — this is the primary responsibility of the role.
Understand regression modeling concepts used in the simulation — this is to understand the larger picture, not to redesign models.
Understand variable definitions, dependencies, and end‑to‑end data flow within the simulation model — this is required to ensure correctness and reproducibility.

Technical Setup

Configure the local development environment in PyCharm.
Run PySpark jobs for simulation workloads.
Validate YAML‑driven configuration pipelines and ensure correct parameter propagation.
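
As a rough illustration of checking parameter propagation, the sketch below validates a parsed configuration against an expected set of parameters. It assumes the YAML file has already been loaded into a dictionary (for example with PyYAML's `yaml.safe_load`). The parameter names `sample_size`, `input_csv`, and `run_label` and the helper `validate_params` are hypothetical and are not taken from the course material.

```python
from __future__ import annotations

from typing import Any

# Hypothetical schema: parameter name -> expected Python type.
EXPECTED_PARAMS: dict[str, type] = {
    "sample_size": int,
    "input_csv": str,
    "run_label": str,
}


def validate_params(config: dict[str, Any],
                    expected: dict[str, type] = EXPECTED_PARAMS) -> list[str]:
    """Return a list of readable problems; an empty list means the config passed."""
    problems = []
    for name, expected_type in expected.items():
        if name not in config:
            problems.append(f"missing parameter: {name}")
        elif not isinstance(config[name], expected_type):
            actual = type(config[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    # Keys the code does not know about are often typos in the YAML file.
    for name in sorted(set(config) - set(expected)):
        problems.append(f"unknown parameter: {name}")
    return problems


# In practice this dictionary would come from parsing the YAML file.
config = {"sample_size": "5000", "input_csv": "sample.csv", "run_labl": "baseline"}
for problem in validate_params(config):
    print(problem)
```

Running a check like this before a PySpark job starts tends to surface a quoted number or a misspelled key early, rather than deep inside the pipeline.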

Code & Configuration

Refactor full‑stack simulation engine code for clarity, correctness, and reusability.
Debug Python and YAML integration issues.
Correct variable definitions, parameter mappings, and configuration usage.

Modeling & Logic

Understand regression modeling techniques used in the simulations.
Analyze how variables impact model outputs across different runs.
Ensure correctness, consistency, and repeatability across simulation executions.
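
To make the link between variables and model output concrete, here is a minimal sketch of a logistic regression prediction. The coefficient values and the variable names `age` and `balance` are invented for illustration; the course description does not specify the actual model or its inputs.

```python
import math


def predict_probability(coefficients: dict, intercept: float, features: dict) -> float:
    """Logistic regression: sigmoid of the intercept plus the weighted sum of features."""
    linear = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, for illustration only.
coefficients = {"age": 0.03, "balance": -0.0002}
intercept = -1.0

baseline = predict_probability(coefficients, intercept, {"age": 40, "balance": 1000})
shifted = predict_probability(coefficients, intercept, {"age": 50, "balance": 1000})

# The function is deterministic, so the same inputs give the same output on every run.
assert baseline == predict_probability(coefficients, intercept, {"age": 40, "balance": 1000})
print(f"baseline={baseline:.4f} shifted={shifted:.4f}")
```

Changing one variable while holding the others fixed, as done with `age` here, is a simple way to see how a single input moves the predicted probability between runs.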

Configs for Running Notebook on Grids – different venv, auth, types of auth, errors

Setting up Git SSH on masters and the local system – git clone, git push, and common errors such as a mismatched ticket name, with fixes such as a restart or a soft reset

Making the code work locally, then on master systems – running locally on a 5k sample and recreating the sample as needed

Changing code with Copilot for different filtering in big data notebooks running on masters that have access to HDFS

Writing notes in MD and also using Copilot

Handshaking CSVs for different repos – CSVs act as a bridge – also understanding CSV definition
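
One way to treat a CSV as a contract between repositories is to check its header against an agreed column definition before the consuming side reads it. The sketch below assumes that approach; the column names and the `check_handshake` helper are hypothetical.

```python
import csv
import io

# Hypothetical column definition agreed between the producing and consuming repos.
EXPECTED_COLUMNS = ["record_id", "segment", "score"]


def check_handshake(csv_file, expected=EXPECTED_COLUMNS):
    """Compare a CSV header with the expected columns; return (missing, unexpected)."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    missing = [col for col in expected if col not in header]
    unexpected = [col for col in header if col not in expected]
    return missing, unexpected


# An in-memory file stands in for the CSV written by the upstream repo.
upstream = io.StringIO("record_id,segment,scor\n1,A,0.7\n")
missing, unexpected = check_handshake(upstream)
print("missing:", missing, "unexpected:", unexpected)
```

Here the misspelled `scor` column shows up as both a missing `score` and an unexpected `scor`, which points directly at the mismatch between the two sides.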

Understanding multithreading errors, which are hard to locate in a local PySpark setup

Starting and running afresh from git clone, auth, and the notebook on master systems

Git clone locally, installing venv locally, and then running unit tests on local data

Manual settings in PyCharm to make local unit tests work

Creating samples that will work with different kinds of tests locally, like 5k local files
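
A small, reproducible sample can be drawn from a larger CSV without loading the whole file into memory. The sketch below uses reservoir sampling with a fixed seed. The function name `sample_rows` and the seed value are assumptions, and a sample size of 5,000 would only mirror the "5k" local files mentioned above.

```python
import csv
import io
import random


def sample_rows(csv_file, k, seed=42):
    """Reservoir-sample k data rows from a CSV file object; returns (header, rows)."""
    rng = random.Random(seed)  # fixed seed so the same sample can be recreated
    reader = csv.reader(csv_file)
    header = next(reader)
    reservoir = []
    for index, row in enumerate(reader):
        if index < k:
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                reservoir[slot] = row
    return header, reservoir


# Demo on an in-memory CSV; a real run would open the full extract and use k=5000.
data = io.StringIO("id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(100)) + "\n")
header, rows = sample_rows(data, k=10)
print(header, len(rows))
```

Because the seed is fixed, rerunning the function on the same input recreates the same sample, which keeps local unit tests stable.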

First make the current unit tests work – settings like no coverage, others in pytest.ini

Adding new variables for local tests

Who this course is for:

Beginners in Python Big Data for Full stack simulation

Course content

Introduction (2 lectures • 4min)

Remote Independent Working in a Managerless Environment 102 (10 lectures • 15min)

Understanding Client Requirements - Search for work yourself - take ownership (8 lectures • 10min)

Setting up Copilot, editors, venvs, other items (4 lectures • 4min)

Revision on PySpark, Git commands and Jenkins build (4 lectures • 7min)

Actual Task - most important video (2 lectures • 14min)

Files Involved (5 lectures • 1min)

HDFS/Slave vs Master vs Local Machine (2 lectures • 1min)

PySpark runs (4 lectures • 1min)

Handshaking between different simulations (2 lectures • 1min)
