Modern CV bootcamp: go from OpenCV/NumPy to training, evaluating and deploying deep vision models.
Practice PyTorch-first workflows with optional TensorFlow/Keras parallels.
Cover CNNs and Vision Transformers with a practitioner’s focus.
Gain practical experience: roughly 70% of the course is hands-on labs on real datasets and projects.
How this helps: build reliable models and ship them to the edge with ONNX/TensorRT.
Who it’s for: individuals with Python/C++ basics tackling perception problems.
Emphasis on reproducibility, robust metrics and responsible evaluation.
Curriculum
Computer vision essentials with OpenCV
- Loading/saving, color spaces, drawing primitives, ROI/cropping
- Transforms: resize/rotate/flip; image arithmetic; masking; channel ops
- Kernels & morphology; gradients/edges; illumination issues and normalization
- Mini-lab: lane-detection pipeline refresher (see the sketch after this list)
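A minimal sketch of the OpenCV operations this module covers, assuming a local test image (the `road.jpg` path is a placeholder):

```python
import cv2
import numpy as np

# Load an image (BGR by default), convert color space, crop an ROI.
img = cv2.imread("road.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
roi = img[100:300, 50:400]              # ROI via NumPy slicing

# Geometric transforms and simple image arithmetic.
small = cv2.resize(img, None, fx=0.5, fy=0.5)
flipped = cv2.flip(img, 1)              # horizontal flip
brighter = cv2.add(img, np.full_like(img, 30))  # saturating add

# Edges + morphology: a common preprocessing chain for lane detection.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)
kernel = np.ones((3, 3), np.uint8)
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

cv2.imwrite("edges.png", closed)
```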
NumPy & array programming
- ndarray basics, broadcasting, views vs copies
- Vectorization, memory layout, dtype & precision trade-offs
- Small lab: implement a convolution and compare to OpenCV (see the sketch after this list)
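One way the convolution lab can be approached (a sketch, not the official solution): implement a naive 2D filter with loops and verify it against `cv2.filter2D` on the interior of the image, where border handling does not matter.

```python
import numpy as np
import cv2

def conv2d_naive(img: np.ndarray, k: np.ndarray) -> np.ndarray:
    """2D cross-correlation (what cv2.filter2D computes), 'valid' borders only."""
    kh, kw = k.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

img = np.random.rand(64, 64).astype(np.float64)
k = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=np.float64)  # Sobel-x

ours = conv2d_naive(img, k)
ref = cv2.filter2D(img, -1, k, borderType=cv2.BORDER_CONSTANT)[1:-1, 1:-1]
print(np.allclose(ours, ref))  # should print True
```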
Neural networks from scratch (concepts)
- Perceptron, activations, logits & softmax, cross-entropy
- Backprop & gradient descent; initialization & normalization
- Overfitting vs generalization; regularization overview
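To make the logits → softmax → cross-entropy chain concrete, here is a small NumPy sketch using the numerically stable softmax and the well-known gradient of cross-entropy with respect to the logits:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    n = y.shape[0]
    return -np.log(probs[np.arange(n), y]).mean()

# Toy batch: 4 samples, 3 classes.
logits = np.random.randn(4, 3)
y = np.array([0, 2, 1, 2])

p = softmax(logits)
loss = cross_entropy(p, y)

# Gradient of the loss w.r.t. the logits: (p - one_hot(y)) / n.
grad = p.copy()
grad[np.arange(4), y] -= 1.0
grad /= 4
logits -= 0.1 * grad  # one plain gradient-descent step
print(loss)
```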
PyTorch essentials (with TensorFlow/Keras parallels)
- Tensors, autograd, modules; writing a training loop
- Data pipelines: Dataset/DataLoader, augmentations (Albumentations basics)
- Metrics & evaluation; confusion matrix; reproducibility (seeds/determinism)
- Mixed precision (AMP) and basic multi-GPU (DDP) — overview + demo
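A compact sketch of the pieces this module assembles: seeding for reproducibility, a `Dataset`/`DataLoader` pair, and a minimal training loop (synthetic tensors stand in for a real dataset; the `ToyImages` class is illustrative):

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

torch.manual_seed(0)  # seed before creating model and data

class ToyImages(Dataset):
    """Synthetic stand-in for a real image dataset."""
    def __init__(self, n=256):
        self.x = torch.randn(n, 3, 32, 32)
        self.y = torch.randint(0, 10, (n,))
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return self.x[i], self.y[i]

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
loader = DataLoader(ToyImages(), batch_size=32, shuffle=True)

for epoch in range(2):
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()          # autograd computes gradients
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```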
Convolutional networks, transfer learning & fine-tuning
- CNN building blocks; receptive fields and shapes; parameter counting
- Using pretrained backbones (torchvision/timm) and head design
- Freeze/partial freeze; discriminative learning rates; early stopping
- Lab: fine-tune a classifier; compare augmentation strategies (see the sketch after this list)
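A sketch of the freeze-and-fine-tune recipe with a torchvision backbone; the choice of which layers to unfreeze and the two-group learning rates (illustrating discriminative LRs) are placeholders, not fixed course values:

```python
import torch
from torch import nn
from torchvision import models

# Load a pretrained backbone and swap in a new classification head.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 5)  # e.g. 5 target classes

# Partial freeze: train only the last stage and the new head.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("layer4", "fc"))

# Discriminative learning rates: smaller for the backbone, larger for the head.
opt = torch.optim.AdamW([
    {"params": model.layer4.parameters(), "lr": 1e-4},
    {"params": model.fc.parameters(),     "lr": 1e-3},
], weight_decay=1e-2)
```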
Detection/segmentation quick tour
- Object detection families (one-stage vs two-stage) — practitioner’s view
- Modern choices: YOLO-family, DETR/RT-DETR (overview), instance vs semantic segmentation
- Dataset formats, annotation tools, and evaluation (mAP/IoU) basics
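IoU is the building block of the mAP evaluation mentioned above; a minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) format:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # 25 / 175 ≈ 0.143
```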
Attention & Vision Transformers (practical overview)
- Self-attention intuition; patches & embeddings; positional encodings
- ViT/DeiT-style fine-tuning workflow
- When to pick CNNs vs ViTs; compute/memory considerations
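The ViT/DeiT fine-tuning workflow looks much like the CNN one; a sketch using timm (the model name is one common choice, not prescribed by the course):

```python
import timm
import torch

# Pretrained ViT with a fresh head for 5 classes.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=5)

# timm exposes the matching preprocessing config for each model.
cfg = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**cfg)

x = torch.randn(2, 3, 224, 224)   # dummy batch at the expected resolution
print(model(x).shape)             # torch.Size([2, 5])
```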
Optimization: training recipes that matter
- Schedulers (cosine/one-cycle), weight decay, label smoothing
- Regularization: dropout, mixup/cutmix (overview)
- Tips for stable training: gradient clipping, sane batch sizes, AMP pitfalls
- Experiment tracking: MLflow or Weights & Biases (brief)
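How several of these recipe pieces combine in code (hyperparameters are illustrative, and a dummy linear model stands in for a real network): AdamW with weight decay, a cosine schedule, label smoothing, and gradient clipping.

```python
import torch
from torch import nn

model = nn.Linear(128, 10)                     # stand-in model
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.05)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=100)
loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)

for step in range(100):
    x = torch.randn(32, 128)
    y = torch.randint(0, 10, (32,))
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    # Clip the global gradient norm to stabilize training.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    opt.step()
    sched.step()   # stepped per iteration here; often stepped per epoch
```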
Responsible CV & data quality
- Dataset curation and splits; leakage & shortcuts
- Bias, robustness, augmentations vs distribution shift
- Documentation of experiments and model cards (lightweight)
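One concrete guard against the leakage mentioned above: split by group (patient, camera, recording session) rather than by individual frame, so near-duplicate images never straddle train and test. A sketch with scikit-learn; the `video_id` grouping is a hypothetical example:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Frames sharing a video_id are highly correlated; a per-frame split
# would leak near-duplicates across train/test.
n = 1000
X = np.arange(n)                       # stand-in for frame indices
video_id = np.random.randint(0, 50, n)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=video_id))
assert set(video_id[train_idx]).isdisjoint(video_id[test_idx])
```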
Deploying to edge and production
- Export: TorchScript and ONNX; verifying numerical parity
- Acceleration: TensorRT / ONNX Runtime / OpenVINO — when to use what
- Quantization (PTQ/QAT), pruning & distillation — practical gains & trade-offs
- Serving options: Triton Inference Server; Jetson deployment notes
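The export-then-verify step in code (a sketch; the opset version and tolerances are typical choices, not mandated by the course):

```python
import numpy as np
import torch
import onnxruntime as ort
from torchvision import models

model = models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

# Export to ONNX.
torch.onnx.export(model, dummy, "model.onnx", opset_version=17,
                  input_names=["input"], output_names=["logits"])

# Verify numerical parity between PyTorch and ONNX Runtime.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
ort_out = sess.run(None, {"input": dummy.numpy()})[0]
with torch.no_grad():
    torch_out = model(dummy).numpy()
np.testing.assert_allclose(torch_out, ort_out, rtol=1e-3, atol=1e-5)
```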
Optional modules
RNN/temporal models & tracking
- Temporal modeling options (RNN/LSTM/GRU vs 1D temporal convs)
- Basics of multi-object tracking (overview)
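To contrast the two temporal options in the first bullet, a sketch running an LSTM and a 1D temporal convolution over the same sequence of per-frame features (shapes are illustrative):

```python
import torch
from torch import nn

B, T, C = 4, 16, 256          # batch, frames, per-frame feature dim
feats = torch.randn(B, T, C)  # e.g. CNN embeddings of each frame

# Option 1: recurrent model over time.
lstm = nn.LSTM(input_size=C, hidden_size=128, batch_first=True)
out, _ = lstm(feats)                       # (B, T, 128)

# Option 2: 1D temporal convolution (expects channels-first: B, C, T).
tconv = nn.Conv1d(C, 128, kernel_size=3, padding=1)
out2 = tconv(feats.transpose(1, 2))        # (B, 128, T)
print(out.shape, out2.shape)
```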
Course Day Structure
- Part 1: 09:00–10:30
- Break: 10:30–10:45
- Part 2: 10:45–12:15
- Lunch break: 12:15–13:15
- Part 3: 13:15–15:15
- Break: 15:15–15:30
- Part 4: 15:30–17:30