⚡ Vibranium | Network Telemetry Lab
ASUS ROG Strix G16 • AMD Ryzen 9 • 32GB RAM • Broker-Centric Architecture
🖥️ Vibranium — Physical Host (ASUS ROG Strix G16 G614PP-MS96)
AMD Ryzen 9 8940HX
2.4GHz (boost 5.2GHz) • 16 cores / 32 threads
32GB DDR5-5200
Dual-channel • Sufficient for 5+ VMs
NVIDIA RTX 5070 8GB
GPU not used for networking lab
1TB NVMe SSD
Fast I/O for VM snapshots
🧠 How vCPUs relate to physical cores & threads (SMT)
AMD Ryzen 9 8940HX: 16 physical cores, each supports Simultaneous Multi-Threading (SMT) → 32 logical processors (threads).
When you assign a vCPU to a VM in VMware Workstation, it schedules execution time across physical host threads.
Rule of thumb: Total assigned vCPUs should not exceed total host threads (32). Overcommit is possible but may cause latency.
For low-latency telemetry lab: Assign vCPUs carefully — network devices rarely need more than 2 vCPUs each.
⚙️ Virtual CPU Allocation — Best Practices for this Lab
| VM | vCPUs | Why this allocation |
|---|---|---|
| jeannie (Fedora / Tools) | 4 | Runs Kafka, Prometheus, Grafana, Postgres, 2x Telegraf — moderate concurrency |
| leanna (Rocky / Ansible) | 2 | Control node, playbook execution, lightweight |
| sai (Arista vEOS) | 2 | Network OS, control plane overhead minimal |
| emias (Cisco vIOS) | 2 | Classic IOS lightweight |
| milána (Juniper vJunos) | 2 | Junos requires 2 for stable telemetry |
| demiá (Cisco NX-OSv) | 2 | Data center switch simulation, gNMI ready |
📊 Total vCPUs assigned: 14 out of 32 host threads → comfortable headroom (no overcommit)
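The headroom figure can be sanity-checked with plain shell arithmetic (run on any Linux box; `nproc` reports that machine's logical processors, which on vibranium would be 32):

```shell
# Sum the per-VM vCPU allocations from the table above
total_vcpus=$((4 + 2 + 2 + 2 + 2 + 2))   # jeannie + leanna + sai + emias + milána + demiá
echo "assigned vCPUs: ${total_vcpus}"                             # 14
echo "headroom: $((32 - total_vcpus)) of 32 host threads free"    # 18
nproc   # logical processors on whatever machine you run this on
```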
💡 VMware schedules vCPUs across physical cores; SMT allows efficient interleaving. Keep per-VM vCPU ≤ physical cores for latency-sensitive workloads.
🖥️ Virtual Machines — vibranium (VMware Workstation Pro)
jeannie
Fedora 43 | 4vCPU / 8GB
192.168.45.129
📦 Tools VM (Docker stack + broker)
leanna
Rocky Linux 9 | 2vCPU / 2GB
192.168.45.130
⚙️ Ansible Control Node
sai
Arista vEOS | 2vCPU / 4GB
192.168.45.131
🔄 gNMI + SNMP source
emias
Cisco vIOS | 2vCPU / 4GB
192.168.45.132
🔌 SNMP / NETCONF (IOS)
milána
Juniper vJunos | 2vCPU / 4GB
192.168.45.133
🌿 Junos telemetry / SNMP
demiá
Cisco NX-OSv | 2vCPU / 8GB
192.168.45.134
🏢 Data Center (gNMI ready)
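The allocations above map to a handful of keys in each VM's `.vmx` file in VMware Workstation. A sketch for jeannie (the key names are standard Workstation options; the core/socket topology is an illustrative choice, not taken from the lab):

```ini
; jeannie — 4 vCPU / 8 GB, per the VM list above
numvcpus = "4"
cpuid.coresPerSocket = "2"   ; 2 sockets x 2 cores — topology is illustrative
memsize = "8192"
```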
🌐 Management Network — VMnet1 (Host‑only) 192.168.45.0/24
📡 jeannie
Fedora | Docker: Kafka, Prometheus, Grafana, Postgres, Telegraf
⚙️ leanna
Rocky Linux 9 | Ansible Core
🌀 sai
Arista vEOS (gNMI / SNMP)
🔷 emias
Cisco vIOS (SNMP)
🌿 milána
Juniper vJunos
🏛️ demiá
Cisco NX-OSv (gNMI ready)
🔗 SSH from leanna → all VMs (Ansible automation)
📡 SNMP polling from jeannie → all network devices • gNMI future streams → Kafka
📐 Reference Architecture — 5-Layer Model (Traditional + Modern)
| Layer | Name | Traditional (Poll/SNMP) | Modern (gNMI/Streaming) |
|---|---|---|---|
| L5 | Data/Presentation | Grafana, Prometheus, PostgreSQL (jeannie) — long‑term storage | |
| L4 | Broker | ✅ Kafka on jeannie • topics: snmp.metrics, gnmi.metrics | |
| L3 | Tool | Telegraf (producer) • Ansible | gnmic, custom streaming adapters |
| L2 | Protocol | SNMPv2c, NETCONF, Syslog | gNMI, OpenConfig, MDT |
| L1 | Device | sai • emias • milána • demiá | |
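The modern column's gNMI path can be driven by gnmic, which takes a YAML configuration. A minimal sketch against sai (the port, credentials, and OpenConfig path are assumptions for illustration, not lab values):

```yaml
# Hypothetical ~/.gnmic.yml sketch
targets:
  192.168.45.131:57400:
    username: admin
    password: admin
    insecure: true
subscriptions:
  ifcounters:
    paths:
      - /interfaces/interface/state/counters
    stream-mode: sample
    sample-interval: 10s
```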
📈 Data Pipeline — SNMP → Kafka → Prometheus → Grafana
sai / emias / milána / demiá ➡️ SNMP (30s) ➡️ Telegraf (producer) ➡️ topic snmp.metrics ➡️ Kafka :9092 ➡️ Telegraf (consumer) ➡️ Prometheus ➡️ Grafana 📊
📦 Kafka replay ➡️ PostgreSQL (timescale) ➕ future: ML / Splunk
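Both Telegraf roles in the pipeline can be sketched as config fragments. The OIDs, community string, consumer group, and listener port below are illustrative assumptions, not taken from the lab files:

```toml
# Producer side: poll devices over SNMP every 30s, publish to Kafka
[[inputs.snmp]]
  agents = ["udp://192.168.45.131:161", "udp://192.168.45.132:161"]
  version = 2
  community = "public"          # assumption — substitute the lab's community
  interval = "30s"
  [[inputs.snmp.field]]
    oid = "RFC1213-MIB::sysUpTime.0"
    name = "uptime"

[[outputs.kafka]]
  brokers = ["localhost:9092"]
  topic = "snmp.metrics"

# Consumer side (a second Telegraf instance): read from Kafka, expose to Prometheus
[[inputs.kafka_consumer]]
  brokers = ["localhost:9092"]
  topics = ["snmp.metrics"]
  consumer_group = "prometheus-bridge"   # assumption
  data_format = "influx"

[[outputs.prometheus_client]]
  listen = ":9273"
```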
⚙️ Infrastructure as Code — Ansible on leanna (Rocky Linux)
📁 ~/ansible-lab/
├── inventory/hosts.yml
├── playbooks/
│ ├── 01-configure-network.yml
│ ├── 02-validate-telemetry.yml
│ └── 03-copp-impact-test.yml
├── ansible.cfg
└── group_vars/
arista.eos cisco.ios junipernetworks.junos cisco.nxos
✅ Multi‑vendor automation (EOS, IOS, Junos, NX-OS)
✅ CoPP impact test playbook
🚀 ansible-playbook playbooks/01-configure-network.yml – configures SNMP, gNMI, and streaming sensors across all devices.
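That playbook run presumes an inventory grouping the devices by platform. A minimal sketch of `inventory/hosts.yml` (group names and connection settings are assumptions; the `ansible_network_os` values follow the collections installed above):

```yaml
all:
  children:
    eos:
      hosts:
        sai:
          ansible_host: 192.168.45.131
          ansible_network_os: arista.eos.eos
    ios:
      hosts:
        emias:
          ansible_host: 192.168.45.132
          ansible_network_os: cisco.ios.ios
    junos:
      hosts:
        milána:
          ansible_host: 192.168.45.133
          ansible_network_os: junipernetworks.junos.junos
    nxos:
      hosts:
        demiá:
          ansible_host: 192.168.45.134
          ansible_network_os: cisco.nxos.nxos
  vars:
    ansible_connection: ansible.netcommon.network_cli
```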
🛡️ Control Plane Policing (CoPP) — Why Broker Wins
❌ Direct polling (10+ Python scripts)
Each script → separate SSH/show commands → device CPU spikes → CoPP drops packets → lost telemetry
✅ Broker‑mediated (single poll + Kafka fan-out)
One poll → Kafka streams to N consumers. Device CPU stable, zero CoPP drops.
🔌 Access & Services
| Service | URL / Command | Credentials |
|---|---|---|
| Grafana | http://192.168.45.129:3000 | admin/admin |
| Prometheus | http://192.168.45.129:9090 | — |
| SSH jeannie | ssh fedora@192.168.45.129 | user password |
| SSH leanna | ssh rocky@192.168.45.130 | Ansible control |
| SSH sai (Arista) | ssh admin@192.168.45.131 | admin/admin |
| SSH emias (IOS) | ssh cisco@192.168.45.132 | cisco/cisco |
| SSH milána (Junos) | ssh root@192.168.45.133 | (none) |
| SSH demiá (NX-OSv) | ssh admin@192.168.45.134 | admin/admin |
📜 Quickstart — Reproducible Deployment
```shell
# On leanna (Ansible control node)
sudo dnf install -y epel-release ansible-core
ansible-galaxy collection install arista.eos cisco.ios junipernetworks.junos cisco.nxos community.docker
# Copy inventory & playbooks to ~/ansible-lab/
ansible-playbook playbooks/01-configure-network.yml

# On jeannie (Tools VM)
cd ~/telemetry-lab
docker compose up -d

# Verify metrics
docker exec kafka kafka-console-consumer --topic snmp.metrics --bootstrap-server localhost:9092 --max-messages 3
```
