I have a list of network devices that I need to SSH into to get the output of a command, and I'm using Netmiko (Python) to do it. Each "login request" is a JSON object holding the host IP and the command to run, and all of them are sent as an array in a single POST request to my Python server:
[{ IP: ..., CMD: ... }, { IP: ..., CMD: ... }, ..., { IP: ..., CMD: ... }]
So far, my thought process for this has been:
- Group the requests by host IP (many of them belong to the same host)
- Process each request from the same host in a multithreaded fashion.
- Group together the outputs from each request and return.
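As a minimal sketch of the grouping step on its own (the helper name `group_by_host` and the sample addresses and commands are made up for illustration, they are not part of my server code):

```python
from collections import defaultdict


def group_by_host(reqs):
    """Group request dicts by their 'IP' key, keeping arrival order per host."""
    grouped = defaultdict(list)
    for req in reqs:
        grouped[req["IP"]].append(req)
    return dict(grouped)


# Hypothetical sample payload; addresses and commands are placeholders.
sample = [
    {"IP": "10.0.0.1", "CMD": "show version"},
    {"IP": "10.0.0.2", "CMD": "show version"},
    {"IP": "10.0.0.1", "CMD": "show ip interface brief"},
]
groups = group_by_host(sample)
```

This walks the request list once, instead of rescanning the whole list for every host.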
And this is what I've implemented:
```python
from fastapi import FastAPI, Request, Body
from netmiko import ConnectHandler
from pydantic import BaseModel
import logging
import json
from typing import Any
import uvicorn
import concurrent.futures

format = "%(asctime)s: %(message)s"
logging.basicConfig(format=format, filename='CNM.log', encoding='utf-8', level=logging.INFO)

app = FastAPI()


def handle_request(payload):
    try:
        device = {
            'device_type': payload['dev_vendor'],
            'host': payload['IP'],
            'username': 'user',
            'password': 'password'
        }
        conn = ConnectHandler(**device)
        conn.read_timeout_override = 20
        output = conn.send_command(payload['CMD'])
        conn.disconnect()
        return output
    except Exception as e:
        logging.error(e)


@app.post("/ssh_login/")
def ssh_login(reqs: Any = Body(None)):
    hosts_list = []
    for req in reqs:
        hosts_list.append(req['IP'])
    hosts_set = set(hosts_list)
    responses = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        for host in hosts_set:
            host_reqs = []
            # Get the host's requests
            for req in reqs:
                if req['IP'] == host:
                    host_reqs.append(req)
            for output in executor.map(handle_request, host_reqs):
                responses.append(output)
    return responses
```
Although this is faster than a purely sequential approach, it still feels slow to me: roughly 1 s per request.
I thought that opening a single SSH connection per host and running all of that host's commands in parallel with the executor could solve the problem, but from what I have read this does not seem viable: sharing the same SSH connection across threads appears to be problematic.
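A variant of that idea that would not share a connection between threads is to give each host one thread, which opens one connection and sends that host's commands over it one after another. Below is a rough sketch only; I have not measured whether it is faster. The names `run_host`, `run_all`, `connect` and `max_workers=8` are hypothetical, and `EchoConn` is a stand-in so the snippet runs without real devices. With Netmiko, `connect` could build a `ConnectHandler` from the request's `dev_vendor` and `IP`.

```python
from concurrent.futures import ThreadPoolExecutor


def run_host(host, host_reqs, connect):
    """Open one connection to `host` and run its commands sequentially."""
    # `connect` is a hypothetical factory: (host, first_request) -> connection
    conn = connect(host, host_reqs[0])
    try:
        return [conn.send_command(req["CMD"]) for req in host_reqs]
    finally:
        conn.disconnect()


def run_all(grouped, connect, max_workers=8):
    """Run each host's batch in its own thread; no connection crosses threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit every host first so the batches can overlap, then collect.
        futures = {
            host: executor.submit(run_host, host, host_reqs, connect)
            for host, host_reqs in grouped.items()
        }
        return {host: future.result() for host, future in futures.items()}


class EchoConn:
    """Stand-in for a real SSH connection, for demonstration only."""

    def __init__(self, host):
        self.host = host

    def send_command(self, cmd):
        return f"{self.host}: {cmd}"

    def disconnect(self):
        pass


# Hypothetical grouped requests: host IP -> list of request dicts.
grouped = {
    "10.0.0.1": [
        {"IP": "10.0.0.1", "CMD": "show version"},
        {"IP": "10.0.0.1", "CMD": "show clock"},
    ],
    "10.0.0.2": [{"IP": "10.0.0.2", "CMD": "show version"}],
}
results = run_all(grouped, connect=lambda host, req: EchoConn(host))
```

The trade-off is that commands for the same host no longer run in parallel, but the login handshake is paid once per host instead of once per request.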
Is there a way to optimize this and gain performance?