
Sunday, March 04, 2018

My very own streaming TV channel - Avengers TV!

I recently returned from a Disney Cruise, and one of the best things about it (for me anyway) was the fact that Disney has a ship-board TV channel that streams Marvel's Avengers movies 24x7. (No, I did not spend the entire cruise inside my stateroom -- I'm not THAT lame. But it was nice that for the moments when I was in my stateroom, there was awesome TV, guaranteed. And the cruise was awesome, even without the TV. That was just icing.) Even before the cruise was over, I started thinking... "How can I do this at home? It shouldn't be too hard to do." And while that was generally true, the Devil (as always) is in the details. Here's what I learned. And it should be easy enough to reproduce now, without all the busted knuckles…

DISCLAIMER:  There is potentially a murky legal issue here.  The laws in the US are a bit contradictory.  You DO have the right to make copies of copyrighted material for private, archival use (say, a backup copy if your source DVD fails).  That said, circumventing copy protection on a DVD or BluRay will put you at odds with the Digital Millennium Copyright Act.  So, what happens when a media provider denies you your legal right to make an archival copy of media you have legally purchased?  I don't know...

Hardware Requirements

Start with a small PC that supports 4K video. I tried a Raspberry Pi, but VLC doesn’t support hardware acceleration on that platform (without compiling from source, anyway), and my experience with the Pi in this case was rather poor. The streams were terribly choppy at “HD” quality.

On the other hand, I found an older NUC with a Celeron processor and a nice GPU that works like a champ. I outfitted a NUC5CPYH with 8GB RAM and a 250GB SATA3 SSD and got excellent results. Clearly not as cheap as a Pi, but in the grand scheme, it’s not too steep a price to pay. For the minimal Ubuntu load we’re going to do, 250GB is a lot of storage for video, and you might not need nearly that much. My video library is full of 2-hour+ videos at 1920x1080 with 5.1 audio, and they run between 3 and 4GB each. So 24 hours of video would run less than 50GB at that rate. Plus, I’m storing all my video on a Synology DS416 NAS. The SSD I used is one I actually re-used from another machine. Something like 120GB would have been plenty, so don’t feel compelled to go big, even if you’re serving the video locally.
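If you want to redo that storage math for your own library, here is a minimal sketch. The function name estimate_gb is just something I made up for illustration, and it assumes whole-number inputs and a roughly uniform file size per movie.

```shell
#!/bin/sh
# Rough storage estimate for a looped playlist.
# Arguments: hours of video, hours per movie, GB per movie (whole numbers).
estimate_gb() {
    hours=$1; hours_per_movie=$2; gb_per_movie=$3
    # Round the movie count up so a partial movie still counts as a whole file.
    movies=$(( (hours + hours_per_movie - 1) / hours_per_movie ))
    echo $(( movies * gb_per_movie ))
}

estimate_gb 24 2 4   # 12 movies at 4GB each, prints 48
```

Plug in your own numbers; at 3GB per movie the same 24 hours comes out to 36GB.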

Another thing to consider is the quality of your networking gear. I run a pair of Juniper EX2300-C switches and Ubiquiti UniFi AC-Pro wireless access points in my house. I can’t speak for how well “big box store” networking gear will work. The actual video stream is not huge (maybe 1.5-2Mbps), but this setup makes use of multicast to distribute the service. So make sure your networking gear provides support for it -- look for IGMP and IGMP snooping support -- and look for class/quality of service support as well. Anyway, something to consider…

Operating System

I built my setup using Ubuntu Server 14.04.5. You’re probably asking why. Why Ubuntu, and why such an old release? Answering in order, first, Ubuntu Server is easy, or at least easy enough, to install and configure. And it’s quite lightweight, especially compared against Windows, so it leaves lots of space for local video if you choose.

Now the next question: Why such an old release? The answer to that has to do with how VLC is packaged for use with Ubuntu. It seems that when Canonical packages software like VLC for an Ubuntu release, the software major release gets frozen with the OS major release. So the only version of VLC you can download for Ubuntu 14.04 via apt is VLC 2.1.x. Go to a newer (and still supported) release of Ubuntu, and you get a 2.2.x release of VLC. And it turns out that multicast streaming (or more correctly, streaming using the ts multiplexer, which is the only multiplexer supported for RTP streams) is broken in the 2.2.x releases. I know, I tested. And while the application will happily tell you it’s sending multicast packets, neither tcpdump nor external testing tools (Wireshark on a SPAN port) show any multicast data leaving the server. So go grab Ubuntu 14.04.latest from the download site. It will definitely do the job.

Additional Software

There are really just three post-install software packages we need to install on the server. The first, as we’ve seen, is VLC. In this case, since we’re running a very lean Ubuntu server (without a window manager), we’re going to install the command-line-only version of VLC. Along with VLC, we’re going to need some support for encoding and decoding audio/video, so there’s a nice package for that. Last, we’ll need a small package to announce the availability of our media stream to interested listeners on our network. Again, there’s a nice little package called minisapserver that will do the trick. Thus the complete list of add-on packages we’ll need is as follows:

  • vlc-nox
  • libavcodec-extra
  • minisapserver

That’s it! That’s all we need on the server.

Now, what about the clients, you ask? Simple enough, we need more VLC! And the great thing about VLC is that it exists for all kinds of platforms and OSes. There are Win/Mac/Linux versions for traditional computing. There are iOS and Android versions for mobile. Heck, there’s even a version for the XBOX One now, right in the Store! So go grab the client(s) you need while you’re at it.

Procedure

Now that we’ve identified the parts, how do we put them all together?

Configure the Hardware

Let’s start with building the server. First and foremost: If your BIOS has options to enable GPU handling of video mux and transcode operations, enable them now. I can’t tell you specifically where to look in your BIOS settings, every PC seems different, but do look. Do Google. Do whatever to find the option, if it exists, in your device. And if it does exist, turn it on before installing the OS, just to be safe.

Install and Configure Ubuntu

Next, it’s time to install Ubuntu on your server. Write the ISO file to your USB stick with your favorite tool (I use Etcher), and boot the server from the USB. I won’t go through all the options during install, in most cases you can just accept the default, but I do want to point out the following:

  1. When it comes to partitioning and formatting your disk, select the “Guided, use entire disk” option. Stay away from the LVM options. I’ve had lots of issues with LVM because Ubuntu doesn’t reserve enough space for the /boot volume. And it will fill up after a couple of kernel updates, if you’re not vigilant about auto-removing unused packages. So play it safe, and avoid LVM.
  2. You will need to create a user account to login and manage the box. The user you create during setup will have admin control of the server via the ‘sudo’ command. The root account is locked by default in Ubuntu. So create a user account you will remember, with an appropriately strong password, for admin purposes.
  3. When it comes time to select additional packages to install, you need only select the OpenSSH server. That will allow us to manage the box via SSH client, rather than needing a keyboard and monitor on the box to do so. So install OpenSSH server, and nothing else.

Once the install is complete, the server will reboot. When it does, test the login you created during the install, and bring the system up-to-date by issuing the following command from the cli:
sudo apt-get update -y && sudo apt-get dist-upgrade -y
When complete, run this to clean out any unused packages:
sudo apt-get autoremove -y
By default, Ubuntu will configure your primary network device for DHCP. That’s nice and easy, for sure, but you may want something a bit more deterministic for managing the server via SSH. If you do want determinism, then you have two options. The first, and the one I would recommend, is to set up a static reservation for the server interface in your DHCP server. How to do that is an exercise left to the reader, and to Google. There are just too many variations in tools and environments to address every situation.

The second option is to configure a static IP address on your server interface. If that’s the path you choose, then your first step is to edit the file /etc/network/interfaces on the server. Use your editor of choice to make changes (nano is probably easiest for the uninitiated):
sudo nano /etc/network/interfaces
You will see a pair of lines similar to the following. Note that Ubuntu 14.04 tended to name Ethernet interfaces something like p2p1 or p128p0 -- it just depends where on the PCI bus your adapter shows up. In my case, it’s p2p1:
auto p2p1
iface p2p1 inet dhcp
Now replace that second line (iface p2p1 inet dhcp) with something akin to the following. (I assume you know what your IP address, netmask, and default gateway are supposed to look like):
iface p2p1 inet static
address 192.168.1.20
netmask 255.255.255.0
gateway 192.168.1.1
dns-nameservers 9.9.9.9
So that’s 5 lines to replace the ‘iface p2p1 inet dhcp’ line. Save your changes and close the file.

When you’ve completed all of that, go ahead and reboot the server. That will ensure all of your software updates take effect, as well as any changes you made to addressing your server.

Where is the Media?

Once the box is up and running again, it’s time to download and install the packages we’ll need to stream media. But before we do, consider where your media will be stored. Is it local -- on the server SSD? If so, then good. You can skip ahead to the next section.

If your media resides on a network attached storage device (a NAS), then we’ll need to do a bit more server config to allow our server to mount shares from the NAS. And I’m assuming here that the NAS supports Windows-style (CIFS/SMB/etc.) shares. If that’s the case, you can follow these steps.

Create a mount point for the share on our server. I used /ds416/media on my server, so I ran the command:
sudo mkdir -p /ds416/media
In general, you can use whatever you want (so long as the directory name/structure isn’t already in use) and run the command:
sudo mkdir -p {{ path_to_my_media }}
Install the CIFS tools on the server
sudo apt-get install -y cifs-utils
Decide what credentials you want to pass to the NAS when you mount the share. (I created a read-only user on the NAS that I use for this purpose. I suggest this because our server will not need write access, and I hate open access shares. Lock it down a bit. Don’t set the bar on the ground for a potential hacker.)

Modify the contents of /etc/fstab to automount the share:
sudo nano /etc/fstab
Add the following line at the end of the file (this is all one line):
{{share_name}} {{media_dir}} cifs username={{user}},password={{password}},iocharset=utf8 0 0 

  • {{share_name}} is the path to the share on the NAS (e.g., //myServer/media)
  • {{media_dir}} is the mount point you created above (e.g., /ds416/media)
  • {{user}} and {{password}} are the credentials you created on the NAS to allow read-only access from the server. (You can make things a bit more secure by moving the credentials to a .smbaccess file. Look here for details: https://wiki.ubuntu.com/MountWindowsSharesPermanently)


Save changes and close.

You can mount your new share now by running the command ‘sudo mount -a’. And if the server reboots for any reason, it will mount automatically.
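For the credentials-file variant, here is a rough sketch of what that could look like. It relies on the credentials= option of mount.cifs, which reads username= and password= lines from a file. The helper names and the /root/.smbaccess path are my own placeholders, not anything the tools require.

```shell
#!/bin/sh
# Write a CIFS credentials file readable only by its owner.
# Arguments: target path, NAS user, NAS password.
write_cifs_credentials() {
    cred_file=$1
    (
        umask 077    # a newly created file gets mode 600
        printf 'username=%s\npassword=%s\n' "$2" "$3" > "$cred_file"
    )
    chmod 600 "$cred_file"   # also tighten a file that already existed
}

# Print the matching /etc/fstab line.
# Arguments: share, mount point, credentials file.
fstab_line() {
    printf '%s %s cifs credentials=%s,iocharset=utf8 0 0\n' "$1" "$2" "$3"
}
```

Running fstab_line //myServer/media /ds416/media /root/.smbaccess prints a line you can paste into /etc/fstab in place of the one with the inline username and password.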

Install the Server Packages

Now that we have media on-hand to stream, it’s time to install the streaming tools. If you remember, we’ll need the vlc-nox, libavcodec-extra, and minisapserver packages. You can install all three with one command (all one line):
sudo apt-get install -y vlc-nox libavcodec-extra minisapserver
Boom, done!

You can test quickly by running ‘vlc --version’ from the command line. Be sure the version reported back is something from the 2.1 release. Mine is 2.1.6.

Create a Playlist

One of the nice things about VLC is that you can feed it a playlist as a source, and it will stream the media files referenced in the playlist. VLC supports M3U playlists, which are nice because they’re just formatted text files. An M3U playlist starts with this first line:
#EXTM3U
Then it contains a bunch of media entries that follow this format. Best to start with an example and then explain the fields:
#EXTINF: 8575, Marvel - The Avengers
#EXTVLCOPT: file-caching=300
The Avengers.m4v
Now the definitions…

  • 8575: The running time of the media, expressed in seconds
  • Marvel - The Avengers: The author and title of the media, separated by a hyphen/dash
  • file-caching=300: A VLC option applied to this entry; it sets the caching value for local files, in milliseconds
  • The Avengers.m4v: The name of the media file on the disk/NAS. You can use a local/relative reference (as I did), or an absolute reference (such as /ds416/media/The Avengers.m4v).

Keep banging out those three lines for each media file you want to include in the playlist. When done, save the file with a .m3u extension (e.g., my-playlist.m3u) and close it out. It’s best to co-locate the playlist file with the actual media (put it in the same folder), as that will allow you to reliably use relative references for the media filenames in the playlist.
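If typing those three lines per movie gets old, something like the following could generate a starting playlist. Treat it as a sketch under a few assumptions: the function name make_playlist is made up, it only looks for .m4v files, it uses the file name as the title, and it writes -1 for the running time (the usual M3U convention for "unknown") since the shell has no easy way to know the real duration. It also writes the tags without the spaces shown in my hand-typed example, which is the more common form.

```shell
#!/bin/sh
# Print an M3U playlist for every .m4v file in a directory.
# Argument: directory holding the media (defaults to the current directory).
make_playlist() {
    dir=${1:-.}
    printf '%s\n' '#EXTM3U'
    for f in "$dir"/*.m4v; do
        [ -e "$f" ] || continue              # no matches: the glob stays literal
        name=$(basename "$f")
        title=${name%.m4v}
        printf '%s\n' "#EXTINF:-1,$title"    # -1 = running time not known
        printf '%s\n' '#EXTVLCOPT:file-caching=300'
        printf '%s\n' "$name"                # relative reference, as in the text
    done
}
```

For example, make_playlist /ds416/media > /ds416/media/my-playlist.m3u drops the playlist next to the media, after which you can edit in real titles and running times by hand.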

Plan the Multicast Environment

Here’s where we turn the network nerd-knobs. We need to know how to stream multicast from our server, so that: (a) we know how to receive it, and (b) we prevent potential multicast storms and/or data leakage. Things we need to set:

  • Multicast group (address): We’ll pick from the range 239.0.0.0/8 for private (administratively scoped) streams. Make it something (relatively) easy to remember, like 239.239.1.1. If you ever add a second stream, make it 239.239.1.2, etc.
  • Destination UDP port: It can really be anything in the range 1-65535. Traditional values are 1234 (old) and 5004 (new). Let’s use 5004 since it’s outside the range of traditionally reserved (and administratively protected) ports.
  • Time to Live (TTL): Let’s set it low. This is how we control where the stream can flood. If you only have one subnet in your network (e.g., single AP, with a single network/VLAN connected to it), then you can set your TTL to 1. If you have multiple networks, separated by a router, then you’ll want to choose a TTL of 2. If you have a hierarchy of networks, separated by multiple routers, then set TTL accordingly. Just remember that TTL is decremented on every router hop, and TTL=0 means the packet won’t be forwarded.
  • Type of Service (TOS/DSCP, this is optional): If you want to mark your streaming traffic as real-time traffic, then we’ll use the hexadecimal value 0xC0 (that’s zero-hex-charlie-zero).
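A quick note on that 0xC0 value, since TOS and DSCP numbers are easy to mix up. The sketch below assumes the value is used as the whole TOS byte (that is my understanding of how VLC treats it, but I have not confirmed it). DSCP is the upper six bits of that byte, so you get it by shifting right two places. The function name tos_to_dscp is just for illustration.

```shell
#!/bin/sh
# Convert a TOS byte to its DSCP value (the upper six bits of the byte).
tos_to_dscp() {
    echo $(( $1 >> 2 ))
}

tos_to_dscp 0xC0    # 0xC0 is 1100 0000 in binary; the upper six bits, 110000, are 48
```

DSCP 48 is the class selector CS6, which is what you would match on if you write QoS rules on your switches.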


So let’s test. Let’s stream to rtp://239.239.1.1:5004/ with a TTL=1 and DSCP marking of 0xC0. On the server, type the following command (all one line):
cvlc -v {{path_to_media/playlist.m3u}} --sout '#rtp{mux=ts,dst=239.239.1.1,port=5004,ttl=1}' --sout-keep --loop --dscp 0xC0
That should spit a whole bunch of log messages at you, but it should come to a rest after a few seconds. If it continues to stream log messages at you for 15 seconds or more, then hit CTRL-C and scroll back through the messages to see what the issue is. Typically at this point it’s a syntax error on the command line -- wrong filename/path, something misspelled, missing double-dash, etc. Review carefully.

Once things do quiet down nicely, grab your favorite client. Open VLC, select Network Stream, and enter the URL rtp://239.239.1.1:5004/ when prompted. In a few seconds, you should see your video streaming on the client. If not, time to review log messages on the server again. Rinse and repeat until you can see your video stream on the client.
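Since most of the syntax errors I hit were inside the --sout string, here is a small sketch that builds it from the three values chosen above, so they live in one place. The helper name rtp_sout and the playlist path in the comments are placeholders of mine; the cvlc options are the same ones used in the test command.

```shell
#!/bin/sh
# Build the --sout chain for an RTP multicast stream.
# Arguments: multicast group, UDP port, TTL.
rtp_sout() {
    printf '#rtp{mux=ts,dst=%s,port=%s,ttl=%s}\n' "$1" "$2" "$3"
}

# Example (not run here): the same test stream, with the values in one place.
#   cvlc -v /ds416/media/my-playlist.m3u --sout "$(rtp_sout 239.239.1.1 5004 1)" \
#        --sout-keep --loop --dscp 0xC0
#
# While it runs, a second SSH session can confirm packets are leaving the box
# (p2p1 is the interface name from earlier; yours may differ):
#   sudo tcpdump -n -c 5 -i p2p1 dst host 239.239.1.1 and udp port 5004
```

If tcpdump shows nothing for the group address, the problem is on the server side, and there is no point chasing the client or the switches yet.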

Start the Stream as a Service

Now that you've had a successful test from the command line, it's time to make it all happen automatically.  Start by creating a user account that will simply be responsible for the operation of VLC:
sudo adduser --system --home /etc/vlc vlc
That will create a system account named vlc, with a home directory of /etc/vlc.  Now let's put something in the vlc user's home directory, namely a script to automatically launch the stream.  Using your favorite editor, create a new file named /etc/vlc/start-vlc.sh and add the following lines to the file:
#!/bin/sh
sudo -u vlc {{command line you ran in the test, and worked}} > /dev/null
Save and close that file.  Now set the ownership and permissions correctly:
sudo chown vlc /etc/vlc/start-vlc.sh
sudo chmod 755 /etc/vlc/start-vlc.sh
Next, open up /etc/rc.local and add the following lines, just before the exit 0 at the end:
# Start streaming with vlc
/etc/vlc/start-vlc.sh &
Save your changes and exit.  The stream should now start automatically if/when you reboot the server.  Go ahead and try it out -- reboot the server.  Once it's back up and running, run the command:
ps ax | grep vlc | grep -v grep
That should return about four lines/processes in the output, two of which will contain a string that looks very similar to the command string you tested with.  If that's the case, fire up your VLC client and verify that you can see the stream.  If you can't, or if the output from your ps doesn't look right, then go back and double-check the /etc/vlc/start-vlc.sh and /etc/rc.local files.  If there's no obvious syntax error, you can run the start script yourself to see what happens:
sudo /etc/vlc/start-vlc.sh
Of course, if that works, your problem is probably in the /etc/rc.local file.  If it doesn't, look at /etc/vlc/start-vlc.sh.  You know the routine: tweak, test, repeat until it works...

Configure the SAP Server

Once you've successfully started the stream automatically, it's time to advertise your stream to interested clients.  The SAP server is responsible for that.  We've already installed minisapserver, so all that's left is to configure it.  So open /etc/sap.cfg in your favorite editor, and set the following:
sap_delay=5
interface={{ifname}}
Replace {{ifname}} with the name of your streaming interface.  Again, in Ubuntu 14.04, it probably looks like p2p1, or something similar.  Under the [program] section, create an entry similar to the following for each stream you want to advertise:
[program]
name={{ friendly name for channel/stream }}
user=videolan
machine={{ server IP address }}
site=-
address={{ multicast address you used in the cvlc command }}
port={{ port number you used in the cvlc command }}
Again, you will need one [program] entry for each stream or channel you want to advertise.
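If you end up with more than a couple of channels, a tiny helper can print the stanzas for you. This is only a sketch: the function name sap_program is made up, the values in the example call are the ones from my test stream, and the keys are exactly the ones listed above.

```shell
#!/bin/sh
# Print one [program] stanza for /etc/sap.cfg.
# Arguments: friendly name, server IP, multicast address, port.
sap_program() {
    printf '[program]\nname=%s\nuser=videolan\nmachine=%s\nsite=-\naddress=%s\nport=%s\n' \
        "$1" "$2" "$3" "$4"
}

sap_program "Avengers TV" 192.168.1.20 239.239.1.1 5004
```

Paste the output into /etc/sap.cfg, once per stream, changing the name and multicast address each time.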

Next, edit the /etc/default/minisapserver file and make sure RUN=yes is set in the config file.  Save and close, if necessary.  All that's left is to (re)start the SAP server:
sudo /etc/init.d/minisapserver restart
That's it!  Crack open a cold one, open your VLC client, and browse the Local Network option.  You should see your stream advertised there.  Click to watch!

Final Thoughts

This is not a perfect process.  And in fact, lots of things will be broken in different ways.  What I've found so far...

First, VLC really is awesome.  It's a powerful tool that can be used to both transmit and watch media files, from lots of sources, and on lots of platforms.  But for all its awesomeness, it is a volunteer-based, open source project.  And as such, it has some warts:

  • We've already discussed that VLC for Linux, versions later than 2.1.x, has issues streaming with the ts-mux.  If you try a 2.2.x release of VLC for Linux, it will break streaming!
  • Multicast in VLC for Mac is completely broken.  I'm testing with version 3.0.0.  It will not stream.  And when I try to receive a multicast stream, I can confirm that VLC never sends an IGMP join for the multicast group (address) that contains the video stream.
  • The XBOX One app is a pretty nice client, but you cannot actually click-to-start a stream that shows up as a SAP advertisement when you browse the Local Network.  You can manually add the stream by typing in the URL in the Network Stream source.
  • The Android app behaves a lot like the XBOX One app.  You can manually add the stream, but you can't tap-to-launch from the SAP advertisement.
  • The iOS app seems to be the nicest client.  You can actually tap the SAP advertisement under Local Network, and it works as expected.  So start here, if you can, for maximum enjoyment.
  • I can't speak to the Windows client (sorry).

Streaming multicast over WiFi can be a bit challenging as well.  I saw really poor performance with my Ubiquiti UniFi system when I started.  I did some Googling that seemed to indicate that you needed to set Multicast Enhancement in your WiFi advanced options.  I'm really not sure why that made the performance better (it seems to only enable IGMP snooping), but it does.  There's this odd statement on the help for the Block LAN to WAN Multicast and Broadcast Data option that says, "Multicast/Broadcast data is sent out at the lowest modulation rate..."  So, maybe by turning on IGMP snooping, you're telling the AP to use better modulation rates if there are interested listeners?  Just a guess, I don't know for sure.  But if your WiFi solution provides options for improving multicast performance, I suggest enabling them.

IGMP snooping has some issues as well.  Basically, switches that perform IGMP snooping seem to assume that the multicast stream does not originate on the local subnet, or if it does, it originates on a switch connected to the IGMP querier.  Because of the way my network is connected between the ISP NID (my router is in the basement, close to the NID) and my office (where all the servers and storage are, three switches away), this is not my setup.  Unfortunately, IGMP snooping will absorb a join request from an interested host and build the multicast tree towards the router (IGMP querier) even if the source is in the other direction.  So you can turn off snooping, in which case the multicast stream is flooded to all switch ports on a VLAN with interested listeners, or you can create static joins in the downstream direction.  As I don't have many streams to manage (at least not now), I chose the latter.

Well, that's it!  It was a lot of fun putting the system together.  I got the chance to exercise some sysadmin skills, some networking skills, and in the end I have a private, in-home TV channel that streams Avengers movies 24 hours a day, seven days a week.  What better reward could there be?

Friday, March 03, 2017

My Quest to Demonstrate EVPN Multihoming to the Server in a Virtualized Data Center Topology

I recently had a customer ask if I could help him develop a proof of concept to illustrate how EVPN multihoming could be used in a datacenter environment to replace technologies like Link Aggregation Groups (LAG) and Multi-Chassis LAG (MC-LAG) to support connections to dual-attached servers.  Eager to prove the concept out, and lacking in physical hardware to rapidly build and test the topology, I decided to try to implement the POC in a KVM environment, with Wistar acting as the topology manager for the environment.  Fundamentally, I was building what has become a "standard" layer 3 Clos datacenter fabric composed of Juniper virtual QFX switches attached to some Ubuntu servers.  Seemed simple enough, but clearly I hadn't thought through all the details...

Standing up the virtualized physical topology was simple enough -- two vQFX spine switches, three vQFX leaf switches.  Connect one server to Leaf #3 by a single Ethernet interface.  Connect the other server to Leaf #1 and #2 -- a single Ethernet interface from the server to each switch.  (Just for fun, I planted a Juniper virtual MX at the top as DC edge router.  Not necessary for the POC, but still fun to play with.)  In all, the topology looks like this:


I had previously written an Ansible playbook that automagically built the configs for each of the Spine and Leaf devices in the topology.  It wasn't perfect, as it only accounted for single (rather than dual) attached servers, but it got me to 98% configured in five minutes.  All that was left was to configure the aggregated Ethernet (ae) interface and its respective child links on Leaf #1 and #2.  Simple.

I started the topology and quickly realized I had a problem.  I was running standard Juniper LAG on the vQFX leaf nodes -- basically that means setting LACP active on the AE interface.  And I had set up bonding on Server #1 for mode 4 (LACP) with hashing set to layer3+4, as close to an analogous config as I could get on the server with respect to the Junos configuration.  But the AE link was down on the switch side...  Careful inspection revealed that no LACP frames were reaching the server.  The vQFX switches were sending them, but the server was not receiving them.  Nor was the server actually sending any LACP of its own.

It took a few seconds for the input to process...  The server was not connected to the vQFX leaf nodes via virtual wires, it was connected via Linux bridges!  And LACP frames are sent to a reserved link-local destination address (01:80:c2:...) that 802.1D compliant switches do not forward.  So the LACP exchange between the leaf nodes and the server was being swallowed by a couple of Linux bridges...

The good news is that there's a fix for that!  Since the 3.2 kernel, developers have included a little tool called a Group Forward Mask that allows you to direct the bridge to ignore (and hence forward) certain layer 2 protocols.  You can write to the Mask in this way:

echo maskValue > /sys/class/net/brXXX/bridge/group_fwd_mask

where maskValue sets the bits in the lower half of the MAC address that you want the bridge to ignore/forward, and brXXX is the name of the bridge where you want to implement this change.  Simple, and the scope is reasonably limited.  So I just wrote 255 to the mask so that my bridges-in-question would just ignore any values in the last octet and forward all of the potential "interesting" traffic (Spanning Tree, LACP, LLDP, etc.) onwards.  Except it didn't work.  I tested with LLDP, and that worked beautifully.  But I still couldn't get my Linux bridges to forward LACP frames.  Hmm...
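For working out mask values, here is a small sketch. It assumes, from my reading of the bridge code (please verify against your own kernel), that bit N of the mask corresponds to the reserved destination address 01:80:c2:00:00:0N, so the bit to set is 1 shifted left by the last octet. The function name fwd_mask_bit is made up for illustration.

```shell
#!/bin/sh
# Mask bit for a reserved group address 01:80:c2:00:00:XX.
# Argument: the last octet XX (e.g. 0x02 for LACP).
fwd_mask_bit() {
    echo $(( 1 << $1 ))
}

fwd_mask_bit 0x02    # LACP (slow protocols), prints 4
fwd_mask_bit 0x0e    # LLDP (nearest bridge), prints 16384
```

To forward more than one protocol, add the individual values together before writing the sum to group_fwd_mask. After writing, cat the same file to see what the kernel actually accepted.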

So a little more reading turned up something else.  The folks who implemented the group_fwd_mask change were afraid of folks like me, and they were concerned that allowing too many protocols (like Spanning Tree and LACP) through could be disastrous -- and they're absolutely correct.  So they implemented another feature in conjunction with the group_fwd_mask -- a #define named BR_GROUPFWD_RESTRICTED that is set to 0x7u, which prevents you from modifying the lowest three bits in the forwarding mask.  So you can't change the bridge behavior to permit Spanning Tree (01:80:c2:00:00:00), 802.3x pause frames (01:80:c2:00:00:01), or LACP (01:80:c2:00:00:02).

Now it's on.  That define is contained in the Linux source tree in the net/bridge/br_private.h header file.  So, add the Linux source package to the hypervisor platform, recompile with BR_GROUPFWD_RESTRICTED set to 0x0u, and reboot on the new kernel.  Then "echo 255 > /sys/class/net/t5_br9/bridge/group_fwd_mask ; echo 255 > /sys/class/net/t5_br10/bridge/group_fwd_mask" and go.  Take that people!

That only half-fixed the issue.  At this point, I could see LACP frames from the vQFX leaf nodes reaching Server #1.  Literally, looking at the output from tcpdump -e -i ens4 (one of my child links on the server) showed perfectly-formatted LACP from the vQFX reaching the server child link.  But the output from "cat /proc/net/bonding/bond0" seemed to indicate that the server wasn't actually processing the frames.  The syslog output seemed to corroborate that as well, as no child links were joining the bond.  And on top of that, the server was not sending any LACP frames.  And yet, LLDP was working great.  And if I forced the vQFX side up by committing "set interfaces ae0 aggregated-ether-options lacp force-up" then I could pass traffic between Server #1 and Server #2.  Weird...

I spent a couple more days occasionally Googling variations of "lacp" and "ubuntu" with "active mode" and "not receiving" to look for answers.  And I tried all permutations of configuration in /etc/network/interfaces where I did or did not add explicit slave devices to the bond interface, and where I did or did not assign a bond master to the child links ... ifdown/ifup combinations ... reboots ...etc.  Then I found this statement that was both odd and obvious on The Geek Stuff Blog:

If the Speed, duplex & Link status is unknown then the interface may be in down status. Try to bring up the interface using “ifconfig up”. If you still do not see the link then the interface is not connected to the switch.

I knew my server and switches were properly connected through the Linux bridges, and they were clearly working, as it seemed I could pass everything BUT stinkin' LACP over the links.  But what did I have to lose?  I checked the output from "ethtool ens4", and sure enough, ethtool reported a whole lot of nothing:

Settings for ens4:
Supported ports: [ ]
Supported link modes:   Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Advertised link modes:  Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: Unknown!
Duplex: Unknown! (255)
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Link detected: yes

So back to check the config for my server Ethernet ports.  It seems that when Wistar built the topology, the network interfaces were created as type virtio.  That clearly wasn't working, so what about good ol' e1000?  Shut the server down, changed the AE child links to type e1000 and rebooted.  Now the ethtool output looks like this:

Settings for ens4:
Supported ports: [ TP ]
Supported link modes:   10baseT/Half 10baseT/Full
                       100baseT/Half 100baseT/Full
                       1000baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Advertised link modes:  10baseT/Half 10baseT/Full
                       100baseT/Half 100baseT/Full
                       1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: off (auto)
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x00000007 (7)
      drv probe link
Link detected: yes

Well, that's a lot better!  And all of a sudden, the output from "cat /proc/net/bonding/bond0" shows healthy data.  And my vQFX's see LACP from the server.  Rolling back the vQFX configs to drop the "force-up" change, it all still works.  Flawless!  I've been running Ping, SSH, SCP, etc. without issue ever since.  And I can even see the various sessions being load-balanced across the child links in the bonded interface!

So there you have it.  While literally nothing else cares about the speed and duplex settings reported out by ethtool, LACP cares.  And without real data to indicate that the links are up, Linux will not do any LACP processing.  And if you want to build your own EVPN multi-homing POC on KVM, remember these important points:

  1. You will need to build your own kernel to tweak the BR_GROUPFWD_RESTRICTED define so that you can manipulate the lower three bits in the group_fwd_mask.
  2. Running that kernel, you will need to write a value to the group_fwd_mask for the correct Linux bridge that directs it to forward LACP.  Do this with great care, as there is a reason why 802.1D bridges do not forward this traffic by default.  Best to ensure that the bridge in question only has two connected devices/interfaces -- the ones at each end of your virtual wire.
  3. You will also need to be sure that the virtual network interface you use on your virtual Linux hosts properly reports interface state in ethtool.  In my case, virtio did not, but e1000 did.
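That last check is easy to script, so you can catch it before losing a couple of days like I did. The sketch below reads ethtool output on stdin and succeeds only when both a numeric speed and a real duplex value are reported. The function name link_params_reported is made up, and the pattern matching is based only on the two ethtool outputs shown above.

```shell
#!/bin/sh
# Succeed only if ethtool output (on stdin) reports a real speed and duplex.
link_params_reported() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'Speed: [0-9]' &&
        printf '%s\n' "$out" | grep -Eq 'Duplex: (Full|Half)'
}

# Example (not run here), using the interface name from this post:
#   ethtool ens4 | link_params_reported || echo "ens4: speed/duplex unknown"
```

Run it against each child link of the bond before you start debugging LACP itself.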


Happy hacking!