Linux Virtual Background Without Green Screen
Open Source Virtual Background
:linux: 🎥
April 9th, 2020
With many of us around the world under shelter in place due to COVID-19, video calls have become a lot more common. In particular, Zoom has controversially become very popular. Arguably Zoom's most interesting feature is the "Virtual Background" support, which allows users to replace the background behind them in their webcam video feed with any image (or video).
I've been using Zoom for a long time at work for Kubernetes open source meetings, usually from my company laptop. With daily "work from home" I'm now inclined to use my more powerful and ergonomic personal desktop for some of my open source work.
Unfortunately, Zoom's Linux client only supports the "chroma-key" A.K.A. "green screen" background removal method. This method requires a solid color backdrop, ideally a green screen with uniform lighting.
Since I do not have a green screen I decided to simply implement my own background removal, which was obviously better than cleaning my apartment or just using my laptop all the time. 😀
It turns out we can actually get pretty decent results with off the shelf, open source components and just a little of our own code.
Reading The Camera
First thing's first: How are we going to get the video feed from our webcam for processing?
Since I use Linux on my personal desktop (when not playing PC games) I chose to use the OpenCV python bindings, as I'm already familiar with them and they include useful image processing primitives in addition to V4L2 bindings for reading from webcams.
Reading a frame from the webcam with python-opencv is very simple:

```python
import cv2
cap = cv2.VideoCapture('/dev/video0')
success, frame = cap.read()
```

For better results with my camera, before capturing set:

```python
# configure camera for 720p @ 60 FPS
height, width = 720, 1280
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
cap.set(cv2.CAP_PROP_FPS, 60)
```

Most video conferencing software seems to cap video at 720p @ 30 FPS or lower, and we won't necessarily read every frame anyhow, so this just sets an upper limit.
Put the frame capture in a loop and we've got our video feed!

```python
while True:
    success, frame = cap.read()
```

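The loop above ignores the success flag. Here is a minimal sketch of a loop that checks it, assuming we want to give up after a few failed reads in a row. The names read_frames, FrameSource, FakeCapture and max_failures are made up for this illustration, and the fake capture object only exists so the sketch runs without a camera. A real cv2.VideoCapture should slot in the same way, since its read() also returns a (success, frame) pair.

```python
from typing import Any, Iterator, Protocol, Tuple


class FrameSource(Protocol):
    """Anything with an OpenCV-style read() method (hypothetical name)."""

    def read(self) -> Tuple[bool, Any]: ...


def read_frames(cap: FrameSource, max_failures: int = 5) -> Iterator[Any]:
    """Yield frames from cap, stopping after max_failures failed reads in a row."""
    failures = 0
    while failures < max_failures:
        success, frame = cap.read()
        if not success:
            failures += 1  # dropped frame or unplugged camera
            continue
        failures = 0
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without a camera."""

    def __init__(self, results):
        self._results = list(results)

    def read(self):
        # Report failure once the scripted results run out.
        return self._results.pop(0) if self._results else (False, None)


scripted = [(True, "a"), (False, None), (True, "b")]
frames = list(read_frames(FakeCapture(scripted), max_failures=2))
print(frames)  # ['a', 'b']
```

Whether to stop, sleep, or reopen the device after repeated failures is a design choice; stopping is just the simplest thing to show.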
We can save a test frame with just:

```python
cv2.imwrite("test.jpg", frame)
```

And now we can see that our camera works. Success!
Finding The Background
OK, now that we have a video feed, how do we identify the background so we can replace it? This is the tricky part …
While Zoom doesn't seem to have commented anywhere about how they implemented this, the way it behaves makes me suspect that a neural network is involved. It's hard to explain, but the results look like one. Additionally, I found an article about Microsoft Teams implementing background blur with a convolutional neural network.
Creating our own network wouldn't be too hard in principle. There are many articles and papers on the topic of image segmentation and plenty of open source libraries and tools, but we need a fairly specialized dataset to get good results.
Specifically we'd need lots of webcam-like images with the ideal human foreground marked pixel by pixel versus the background.
Building this sort of dataset in preparation for training a neural net probably would be a lot of work. Thankfully a research team at Google has already done all of this hard work and open sourced a pre-trained neural network for "person segmentation" called BodyPix that works pretty well! ❤️
BodyPix is currently only available in TensorFlow.js form, so the easiest way to use it is from the body-pix-node library.
To get faster inference (prediction) in the browser a WebGL backend is preferred, but in node we can use the Tensorflow GPU backend (NOTE: this requires an NVIDIA graphics card, which I have).
To make this easier to set up, we'll start by setting up a small containerized tensorflow-gpu + node environment / project. Using this with nvidia-docker is much easier than getting all of the correct dependencies set up on your host; it only requires docker and an up-to-date GPU driver on the host.
bodypix/package.json

```json
{
    "name": "bodypix",
    "version": "0.0.1",
    "dependencies": {
        "@tensorflow-models/body-pix": "^2.0.5",
        "@tensorflow/tfjs-node-gpu": "^1.7.1"
    }
}
```

bodypix/Dockerfile

```dockerfile
# Base image with TensorFlow GPU requirements
FROM nvcr.io/nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
# Install node
RUN apt update && apt install -y curl make build-essential \
    && curl -sL https://deb.nodesource.com/setup_12.x | bash - \
    && apt-get -y install nodejs \
    && mkdir /.npm \
    && chmod 777 /.npm
# Ensure we can get enough GPU memory
# Unfortunately tfjs-node-gpu exposes no gpu configuration :(
ENV TF_FORCE_GPU_ALLOW_GROWTH=true
# Install node package dependencies
WORKDIR /src
COPY package.json /src/
RUN npm install
# Setup our app as the entrypoint
COPY app.js /src/
ENTRYPOINT node /src/app.js
```

Now to serve the results… WARNING: I am not a node expert! This is just my quick evening hack, bear with me :-)
The following simple script replies to an HTTP POSTed image with a binary mask (a 2D array of binary pixels, where zero pixels are the background).

bodypix/app.js

```javascript
const tf = require('@tensorflow/tfjs-node-gpu');
const bodyPix = require('@tensorflow-models/body-pix');
const http = require('http');
(async () => {
    // load the model once at startup
    const net = await bodyPix.load({
        architecture: 'MobileNetV1',
        outputStride: 16,
        multiplier: 0.75,
        quantBytes: 2,
    });
    const server = http.createServer();
    server.on('request', async (req, res) => {
        // collect the POSTed image bytes
        const chunks = [];
        req.on('data', (chunk) => {
            chunks.push(chunk);
        });
        req.on('end', async () => {
            const image = tf.node.decodeImage(Buffer.concat(chunks));
            const segmentation = await net.segmentPerson(image, {
                flipHorizontal: false,
                internalResolution: 'medium',
                segmentationThreshold: 0.7,
            });
            // reply with one byte per pixel: 1 for person, 0 for background
            res.writeHead(200, { 'Content-Type': 'application/octet-stream' });
            res.write(Buffer.from(segmentation.data));
            res.end();
            tf.dispose(image);
        });
    });
    server.listen(9000);
})();
```

We can use numpy and requests to convert a frame to a mask from our python script with the following method:

```python
import cv2
import numpy as np
import requests

def get_mask(frame, bodypix_url='http://localhost:9000'):
    _, data = cv2.imencode(".jpg", frame)
    r = requests.post(
        url=bodypix_url,
        data=data.tobytes(),
        headers={'Content-Type': 'application/octet-stream'})
    # convert raw bytes to a numpy array
    # raw data is uint8[width * height] with value 0 or 1
    mask = np.frombuffer(r.content, dtype=np.uint8)
    mask = mask.reshape((frame.shape[0], frame.shape[1]))
    return mask
```

Which gives us a result something like:
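To make the wire format a bit more concrete, here is a small sketch of just the decoding half, with no HTTP involved. It assumes the response body is exactly one byte per pixel in row-major order, which is what the reshape in get_mask relies on. The decode_mask helper and its length check are additions for illustration only.

```python
import numpy as np


def decode_mask(raw: bytes, height: int, width: int) -> np.ndarray:
    """Turn a raw 0/1 byte buffer into a (height, width) uint8 mask."""
    expected = height * width
    if len(raw) != expected:
        raise ValueError(f"expected {expected} mask bytes, got {len(raw)}")
    # frombuffer gives a read-only view of the bytes; copy so later
    # post-processing steps are free to modify the mask.
    return np.frombuffer(raw, dtype=np.uint8).reshape((height, width)).copy()


# A 2x3 "frame" where only the middle column is a person.
mask = decode_mask(bytes([0, 1, 0, 0, 1, 0]), height=2, width=3)
print(mask)
```

Checking the length up front turns a truncated or error response into a clear message instead of a confusing reshape failure.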
While I was working on this, I spotted this tweet:
This is definitely the Best background for video calls. 💯
— Ashley Willis (McNamara) (@ashleymcnamara) Apr 2, 2020
Now that we have the foreground / background mask, it will be easy to replace the background.
After grabbing the awesome "Virtual Background" picture from that twitter thread and cropping it to a 16:9 ratio image …
… we can do the following:

```python
# read in a "virtual background" (should be in 16:9 ratio)
replacement_bg_raw = cv2.imread("background.jpg")

# resize to match the frame (height & width from before)
# note: cv2.resize takes the target size as (width, height)
height, width = 720, 1280
replacement_bg = cv2.resize(replacement_bg_raw, (width, height))

# combine the background and foreground, using the mask and its inverse
inv_mask = 1 - mask
for c in range(frame.shape[2]):
    frame[:,:,c] = frame[:,:,c] * mask + replacement_bg[:,:,c] * inv_mask
```

Which gives us:
The raw mask is clearly not tight enough due to the performance trade-offs we made with our BodyPix parameters but .. so far so good!
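As an aside, the per-channel loop can also be written as a single broadcasted expression. This is only a sketch on tiny made-up arrays, assuming the frame and background already have the same shape. The composite function is a name invented here, not part of OpenCV.

```python
import numpy as np


def composite(frame: np.ndarray, background: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep frame where mask is 1 and background where mask is 0."""
    if frame.shape != background.shape:
        raise ValueError("frame and background must have the same shape")
    # Add a trailing axis so the (H, W) mask broadcasts over the color channels.
    alpha = mask[:, :, np.newaxis].astype(np.float32)
    blended = frame * alpha + background * (1.0 - alpha)
    return blended.astype(np.uint8)


frame = np.full((2, 2, 3), 200, dtype=np.uint8)      # stand-in camera frame
background = np.full((2, 2, 3), 50, dtype=np.uint8)  # stand-in virtual background
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)    # "person" on the diagonal
out = composite(frame, background, mask)
print(out[:, :, 0])
```

Because the mask is converted to float, the same function keeps working if the mask later holds fractional values instead of just 0 and 1.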
This background gave me an idea …
Making It Fun
Now that we have the masking done, what can we do to make it look better?
The first obvious step is to smooth the mask out, with something like:

```python
def post_process_mask(mask):
    mask = cv2.dilate(mask, np.ones((10, 10), np.uint8), iterations=1)
    mask = cv2.erode(mask, np.ones((10, 10), np.uint8), iterations=1)
    return mask
```

This can help a bit, but it's pretty minor, and just replacing the background is a little boring. Since we've hacked this up ourselves, we can do anything instead of just a basic background removal …
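For the curious: a dilate followed by an erode with the same kernel is usually called a morphological closing, and its main effect is filling small holes and gaps in the mask. OpenCV exposes the combined operation as cv2.morphologyEx with cv2.MORPH_CLOSE. The sketch below is a plain numpy toy version, assuming a binary mask and an odd square kernel, just to show what happens to a one pixel hole. The dilate and erode helpers here are simplified stand-ins, not the OpenCV functions.

```python
import numpy as np


def dilate(mask: np.ndarray, k: int) -> np.ndarray:
    """Binary dilation with a k x k square kernel (k odd), zero padded."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=0)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(k):
        for dx in range(k):
            # A pixel turns on if any neighbor under the kernel is on.
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out


def erode(mask: np.ndarray, k: int) -> np.ndarray:
    """Binary erosion, written as dilation of the inverted mask."""
    return 1 - dilate(1 - mask, k)


# A 5x5 "person" blob with a one pixel hole in the middle.
mask = np.zeros((9, 9), dtype=np.uint8)
mask[2:7, 2:7] = 1
mask[4, 4] = 0

closed = erode(dilate(mask, 3), 3)
print(closed)
```

The blob keeps its outline while the hole disappears, which is the kind of cleanup the smoothing step is after.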
Given that we're using a Star Wars "virtual background" I decided to create a hologram effect to fit in better. This also lets us lean into blurring the mask.
First update the post processing to:

```python
def post_process_mask(mask):
    mask = cv2.dilate(mask, np.ones((10, 10), np.uint8), iterations=1)
    mask = cv2.blur(mask.astype(float), (30, 30))
    return mask
```

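To see why blurring helps, it may be useful to look at a single row of the mask. The sketch below is a 1D toy, assuming a 3-wide box filter as a small stand-in for the (30, 30) blur. The pixel values are made up.

```python
import numpy as np

# One row of a binary mask: background (0) on the left, person (1) on the right.
row = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

# Repeat the edge values so the filter does not darken the ends of the row,
# then average each pixel with its two neighbors (a 3-wide box filter).
padded = np.pad(row, 1, mode="edge")
soft = np.convolve(padded, np.ones(3) / 3, mode="valid")

# Blend a bright foreground over a dark background using the soft mask.
foreground, background = 200.0, 40.0
blended = foreground * soft + background * (1 - soft)
print(np.round(soft, 2))
print(np.round(blended))
```

The hard 0 to 1 step becomes a short ramp, so the composited edge fades from background to foreground over a few pixels instead of switching abruptly.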
Now the edges are blurry, which is good, but we need to start building the hologram effect.
Hollywood holograms typically have the following properties:
- washed out / monochromatic color, as if done with a bright laser
- scan lines or a grid-like effect, as if many beams created the image
- "ghosting" as if the projection is done in layers or imperfectly reaching the correct distance
We can add these step by step.
First for the blue tint we just need to apply an OpenCV colormap:

```python
# map the frame into a blue-green colorspace
holo = cv2.applyColorMap(frame, cv2.COLORMAP_WINTER)
```

Then we can add the scan lines with a halftone-like effect:

```python
# for every bandLength rows darken to 10-30% brightness,
# then don't touch for bandGap rows.
bandLength, bandGap = 2, 3
for y in range(holo.shape[0]):
    if y % (bandLength + bandGap) < bandLength:
        holo[y,:,:] = holo[y,:,:] * np.random.uniform(0.1, 0.3)
```

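The row loop is easy to read, but the same banding can be expressed with a boolean row index if the Python loop ever becomes a bottleneck. This is only a sketch. add_scan_lines and its rng parameter are names invented here, and the flat gray test image stands in for a real frame.

```python
import numpy as np


def add_scan_lines(img, band_length=2, band_gap=3, rng=None):
    """Darken band_length rows out of every band_length + band_gap rows."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.copy()
    rows = np.arange(img.shape[0])
    in_band = rows % (band_length + band_gap) < band_length
    # One random brightness factor per darkened row, as in the loop version.
    factors = rng.uniform(0.1, 0.3, size=int(in_band.sum()))
    out[in_band] = (img[in_band] * factors[:, None, None]).astype(img.dtype)
    return out


img = np.full((10, 4, 3), 200, dtype=np.uint8)  # flat gray stand-in frame
lined = add_scan_lines(img, rng=np.random.default_rng(0))
print(lined[:, 0, 0])
```

Rows 0, 1, 5 and 6 come out darkened and the rest are untouched, matching the 2-on, 3-off pattern.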
Next we can add some ghosting by adding weighted copies of the current effect, shifted along an axis:

```python
# shift_img from: https://stackoverflow.com/a/53140617
def shift_img(img, dx, dy):
    img = np.roll(img, dy, axis=0)
    img = np.roll(img, dx, axis=1)
    # np.roll wraps around, so zero out the pixels that wrapped
    if dy > 0:
        img[:dy, :] = 0
    elif dy < 0:
        img[dy:, :] = 0
    if dx > 0:
        img[:, :dx] = 0
    elif dx < 0:
        img[:, dx:] = 0
    return img

# the first one is roughly: holo * 0.2 + shifted_holo * 0.8 + 0
holo2 = cv2.addWeighted(holo, 0.2, shift_img(holo.copy(), 5, 5), 0.8, 0)
holo2 = cv2.addWeighted(holo2, 0.4, shift_img(holo.copy(), -5, -5), 0.6, 0)
```

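If the roll-then-zero trick feels a little indirect, the same shift can be sketched with plain slicing: copy the part of the image that stays in view into a zeroed array. shift_zero_fill is a hypothetical name for this illustration, and the 3x3 array is just a toy input.

```python
import numpy as np


def shift_zero_fill(img: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """Shift img by (dx, dy) pixels, filling vacated pixels with zeros."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # everything is shifted out of view
    # Source and destination windows that overlap after the shift.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out


img = np.arange(1, 10).reshape(3, 3)
shifted = shift_zero_fill(img, 1, 1)
print(shifted)
```

Positive dx moves content right and positive dy moves it down, the same convention as the roll-based version.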
Last: We'll want to keep some of the original color, so let's combine the holo effect with the original frame, similar to how we added the ghosting:

```python
# frame is the original, unprocessed camera frame
holo_done = cv2.addWeighted(frame, 0.5, holo2, 0.6, 0)
```

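Note that the two weights here add up to 1.1 rather than 1.0, so bright pixels get pushed past the top of the 8-bit range and clip. The numpy sketch below approximates that kind of saturating weighted sum on three made-up pixel values. The add_weighted helper is an illustration, not the OpenCV implementation, and its rounding is an assumption.

```python
import numpy as np


def add_weighted(a, alpha, b, beta, gamma=0.0):
    """NumPy sketch of a weighted sum clipped to the uint8 range."""
    total = a.astype(np.float64) * alpha + b.astype(np.float64) * beta + gamma
    return np.clip(np.rint(total), 0, 255).astype(np.uint8)


original = np.array([100, 200, 250], dtype=np.uint8)
holo = np.array([100, 200, 250], dtype=np.uint8)
# Weights of 0.5 and 0.6 sum to 1.1, so the brightest pixel clips at 255.
result = add_weighted(original, 0.5, holo, 0.6)
print(result)  # [110 220 255]
```

That slight over-brightening is part of what gives the final image its washed out, oversaturated look.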
A frame with the hologram effect now looks like:
On its own this looks pretty 🤷
But combined with our virtual background it looks more like:
There we go! 🎉 (I promise it looks cooler with motion / video 🙃)
Outputting Video
Now we're just missing one thing … We can't actually use this in a call yet.
To fix that, we're going to use pyfakewebcam and v4l2loopback to create a fake webcam device.
We're also going to actually wire this all up with docker.
First create a requirements.txt with our dependencies:

fakecam/requirements.txt

```text
numpy==1.18.2
opencv-python==4.2.0.32
requests==2.23.0
pyfakewebcam==0.1.0
```

And then the Dockerfile for the fake camera app:

fakecam/Dockerfile

```dockerfile
FROM python:3-buster
# ensure pip is up to date
RUN pip install --upgrade pip
# install opencv dependencies
RUN apt-get update && \
    apt-get install -y \
    `# opencv requirements` \
    libsm6 libxext6 libxrender-dev \
    `# opencv video opening requirements` \
    libv4l-dev
# install our requirements
WORKDIR /src
COPY requirements.txt /src/
RUN pip install --no-cache-dir -r /src/requirements.txt
# copy in the virtual background
COPY background.jpg /data/
# run our fake camera script (with unbuffered output for easier debug)
COPY fake.py /src/
ENTRYPOINT python -u fake.py
```

We're going to need to install v4l2loopback from a shell:

```shell
sudo apt install v4l2loopback-dkms
```

And then configure a fake camera device:

```shell
sudo modprobe -r v4l2loopback
sudo modprobe v4l2loopback devices=1 video_nr=20 card_label="v4l2loopback" exclusive_caps=1
```

We need the exclusive_caps setting for some apps (chrome, zoom) to work. The label is just for our convenience when selecting the camera in apps, and the video number just makes this /dev/video20 if available, which is unlikely to be already in use.
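Since the module only uses /dev/video20 if that number is available, it can be worth checking that the device node actually showed up before pointing anything at it. Here is a small standard-library sketch. is_char_device is a helper name made up for this example, and it only confirms that the path is a character device, not that it is a working loopback camera.

```python
import os
import stat


def is_char_device(path: str) -> bool:
    """Return True if path exists and is a character device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISCHR(mode)


device = "/dev/video20"  # matches video_nr=20 above
if is_char_device(device):
    print(f"{device} is ready")
else:
    print(f"{device} not found; is the v4l2loopback module loaded?")
```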
Now we can update our script to create the fake camera:

```python
# again use width, height from before
fake = pyfakewebcam.FakeWebcam('/dev/video20', width, height)
```

We also need to note that pyfakewebcam expects images in RGB (red, green, blue) while our OpenCV operations are in BGR (blue, green, red) channel order.
We can fix this before outputting and then send a frame with:

```python
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
fake.schedule_frame(frame)
```

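If it helps to see what the conversion does, BGR to RGB on a 3-channel image is just a reversal of the last axis. The sketch below shows that with plain numpy on a made-up two pixel image. The contiguous copy is a precaution I am assuming is useful for consumers that expect ordinary memory layout, not something the source calls for.

```python
import numpy as np

bgr = np.zeros((1, 2, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)  # pure blue in BGR order
bgr[0, 1] = (0, 0, 255)  # pure red in BGR order

# Reversing the last axis swaps the B and R channels; the reversed view is
# then copied into ordinary contiguous memory.
rgb = np.ascontiguousarray(bgr[:, :, ::-1])
print(rgb[0, 0], rgb[0, 1])
```

The blue pixel now reads (0, 0, 255) and the red one (255, 0, 0), which is what an RGB consumer expects.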
All together the script looks like:
fakecam/fake.py

```python
import cv2
import numpy as np
import requests
import pyfakewebcam

# inside the fakecam container, the bodypix service is reached by its
# container name on the shared docker network (localhost would be this
# container itself)
def get_mask(frame, bodypix_url='http://bodypix:9000'):
    _, data = cv2.imencode(".jpg", frame)
    r = requests.post(
        url=bodypix_url,
        data=data.tobytes(),
        headers={'Content-Type': 'application/octet-stream'})
    mask = np.frombuffer(r.content, dtype=np.uint8)
    mask = mask.reshape((frame.shape[0], frame.shape[1]))
    return mask

def post_process_mask(mask):
    mask = cv2.dilate(mask, np.ones((10, 10), np.uint8), iterations=1)
    mask = cv2.blur(mask.astype(float), (30, 30))
    return mask

def shift_image(img, dx, dy):
    img = np.roll(img, dy, axis=0)
    img = np.roll(img, dx, axis=1)
    if dy > 0:
        img[:dy, :] = 0
    elif dy < 0:
        img[dy:, :] = 0
    if dx > 0:
        img[:, :dx] = 0
    elif dx < 0:
        img[:, dx:] = 0
    return img

def hologram_effect(img):
    # add a blue tint
    holo = cv2.applyColorMap(img, cv2.COLORMAP_WINTER)
    # add a halftone effect
    bandLength, bandGap = 2, 3
    for y in range(holo.shape[0]):
        if y % (bandLength + bandGap) < bandLength:
            holo[y,:,:] = holo[y,:,:] * np.random.uniform(0.1, 0.3)
    # add some ghosting
    holo_blur = cv2.addWeighted(holo, 0.2, shift_image(holo.copy(), 5, 5), 0.8, 0)
    holo_blur = cv2.addWeighted(holo_blur, 0.4, shift_image(holo.copy(), -5, -5), 0.6, 0)
    # combine with the original color, oversaturated
    out = cv2.addWeighted(img, 0.5, holo_blur, 0.6, 0)
    return out

def get_frame(cap, background_scaled):
    _, frame = cap.read()
    # fetch the mask with retries (the app needs to warmup and we're lazy)
    # e v e n t u a l l y c o n s i s t e n t
    mask = None
    while mask is None:
        try:
            mask = get_mask(frame)
        except requests.RequestException:
            print("mask request failed, retrying")
    # post-process mask and frame
    mask = post_process_mask(mask)
    frame = hologram_effect(frame)
    # composite the foreground and background
    inv_mask = 1 - mask
    for c in range(frame.shape[2]):
        frame[:,:,c] = frame[:,:,c] * mask + background_scaled[:,:,c] * inv_mask
    return frame

# setup access to the *real* webcam
cap = cv2.VideoCapture('/dev/video0')
height, width = 720, 1280
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
cap.set(cv2.CAP_PROP_FPS, 60)

# setup the fake camera
fake = pyfakewebcam.FakeWebcam('/dev/video20', width, height)

# load the virtual background
background = cv2.imread("/data/background.jpg")
background_scaled = cv2.resize(background, (width, height))

# frames forever
while True:
    frame = get_frame(cap, background_scaled)
    # fake webcam expects RGB
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    fake.schedule_frame(frame)
```

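The retry loop in get_frame spins as fast as it can and never gives up, which is fine for an evening hack. If you wanted something a little gentler, one option is a bounded retry with a pause between attempts. The sketch below is one possible shape for that. The retry helper, its parameters, and the flaky stand-in function are all made up for illustration.

```python
import time


def retry(fn, attempts=5, delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Call fn() until it succeeds, sleeping between failed attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions as err:
            last_error = err
            print(f"attempt {attempt} failed: {err}")
            if attempt < attempts:
                sleep(delay)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Demo: a stand-in for get_mask that fails twice while the service warms up.
calls = {"n": 0}


def flaky_get_mask():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("service not ready")
    return "mask"


result = retry(flaky_get_mask, attempts=5, delay=0.0)
print(result, calls["n"])
```

In the script above this could wrap the mask request, for example retry(lambda: get_mask(frame), exceptions=(requests.RequestException,)), so a dead bodypix container produces an error instead of an endless stream of log lines.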
Now build the images:

```shell
docker build -t bodypix ./bodypix
docker build -t fakecam ./fakecam
```

And run them like:

```shell
# create a network
docker network create --driver bridge fakecam
# start the bodypix app
docker run -d \
  --name=bodypix \
  --network=fakecam \
  -p 9000:9000 \
  --gpus=all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  bodypix
# start the camera, note that we need to pass through video devices,
# and we want our user ID and group to have permission to them
# you may need to `sudo usermod -aG video $USER`
docker run -d \
  --name=fakecam \
  --network=fakecam \
  -u "$(id -u):$(getent group video | cut -d: -f3)" \
  $(find /dev -name 'video*' -printf "--device %p ") \
  fakecam
```

Now make sure to start this before opening the camera with any apps, and be sure to select the "v4l2loopback" / /dev/video20 camera in Zoom etc.
The Finished Result
Here's a quick clip I recorded of this in action:
Look! I'm dialing in from the Millennium Falcon with an open source camera stack!
I'm pretty happy with how this came out. I'll definitely be joining all of my meetings this way in the morning. 😀
Source: https://elder.dev/posts/open-source-virtual-background/