pantz.org
SaltStack Minion communication and missing returns
Posted on 05-30-2016 04:42:57 UTC | Updated on 05-30-2016 05:17:50 UTC
Section: /software/saltstack/ | Permanent Link

Setting up SaltStack is a fairly easy task, and there is plenty of documentation for it. This is not an install tutorial. It is an explanation of what goes on in SaltStack Master and Minion communication, and how to troubleshoot it, mostly when using the CLI to send commands from the Master to the Minions.

Basic Check List

After you have installed the Salt Master and Salt Minion software, the first thing to do after starting your Master is to open each Minion's config file, /etc/salt/minion, and fill out the "master:" line to tell the Minion where his Master is. Then start or restart the Salt Minion. Do this for all of your Minions.

Go back to the Master and accept all of the Minions' keys (salt-key -L lists them, salt-key -A accepts all pending keys). If you don't see a certain Minion's key, here are some things you should check.

  1. Are your Minion and Master running the same software version? The Master can usually work at a higher version than the Minions. Try to keep them the same if possible.
  2. Is your salt-minion service running? Make sure it is set to run on start as well.
  3. Has the Minion's key been accepted by the Master? If you don't even see a key request from the Minion, then the Minion is not talking to the Master at all.
  4. Does the Minion have an unobstructed network path back to TCP ports 4505 and 4506 on the Master? The Minions initiate the TCP connections back to the Master, so they don't need any inbound ports open themselves. Watch out for those firewalls.
  5. Check the Minion's log file, /var/log/salt/minion, for key issues or any other issues.
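Item 4 in the list above is easy to test from the Minion side. Here is a minimal sketch that simply tries to open a TCP connection to each port. The function name and the 3 second timeout are my own choices, and 4505/4506 are the Master's default publish and return ports, so adjust them if your Master config changes them.

```python
import socket

# Default Salt Master ports: 4505 (publish) and 4506 (return).
MASTER_PORTS = (4505, 4506)


def check_master_ports(host, ports=MASTER_PORTS, timeout=3.0):
    """Return {port: True/False}: could a TCP connection be opened to host:port?"""
    results = {}
    for port in ports:
        try:
            # create_connection completes the TCP handshake or raises OSError
            # (refused, timed out, unresolvable host name, ...).
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Run it on the Minion with the same host name you put on the "master:" line, for example check_master_ports("salt"). A False for either port points at a firewall or routing problem rather than a Salt problem.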

Basic Communication

Now let's say you have all of the basic network and key issues worked out and would like to send some jobs to your Minions. You can do this via the Salt CLI, with something like salt \* cmd.run 'echo HI'. This is considered a job by Salt. The Minions get this request, run the command, and return the job information to the Master. The CLI talks to the Master, which is listening for the return messages as they come in on the ZMQ bus. The CLI then reports back the status and output of the job.

That is a basic view of this process. But sometimes Minions don't return job information, and you ask yourself what the heck happened. You know the Minion is running fine. Eventually you find out you don't really understand Minion/Master job communication at all.

Detailed Breakdown of Master Minion CLI Communication

By default, when job information is returned to the Master it is stored on disk in the job cache. We will assume this is the case below.

The Salt CLI is just a small bit of code that interfaces with the API SaltStack has written, which allows anyone to send commands to the Minions programmatically. The CLI is not connected directly to the Minions when the job request is made. When the CLI makes a job request, it is handed to the Master to fulfill.
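To make the "programmatically" part concrete, here is a small sketch of what a script on the Master could do with that API. On a real Master the client would be salt.client.LocalClient(), whose cmd() method returns a dict of Minion id to return value for the Minions that answered. The helper name, the FakeClient stand-in, and the Minion ids are made up so the example runs without Salt installed.

```python
def run_and_find_missing(client, target, command, expected_minions, timeout=5):
    """Run a shell command through cmd.run and list expected Minions that did not return."""
    # client.cmd(tgt, fun, arg, timeout=...) -> {minion_id: return_value}
    returns = client.cmd(target, "cmd.run", [command], timeout=timeout)
    missing = sorted(set(expected_minions) - set(returns))
    return returns, missing


class FakeClient:
    """Stand-in for salt.client.LocalClient so the sketch runs anywhere."""

    def cmd(self, tgt, fun, arg=(), timeout=None):
        # Pretend only web1 answered before the timeout.
        return {"web1": "HI"}


# On a Master you would use:
#   import salt.client
#   client = salt.client.LocalClient()
client = FakeClient()
returns, missing = run_and_find_missing(client, "*", "echo HI", ["web1", "db1"])
print(returns)  # {'web1': 'HI'}
print(missing)  # ['db1']
```

Comparing the returns against a list of Minions you expected is a cheap way to spot missing returns from a script instead of eyeballing CLI output.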

There are 2 key timeout periods you need to be aware of before we go into an explanation of how a job request is handled. They are "timeout" and "gather_job_timeout".

When the CLI command is issued, the Master gathers a list of Minions with valid keys so it knows which Minions are on the system. It validates and filters the targeting information from the given target list and sets that as its list (targets) of Minions for the job. Now the Master has a list of who should return information when queried. The Master takes the requested command, target list, job id, and a few other pieces of info, and broadcasts a message on the ZeroMQ bus to all of the Minions. When the Minions get the message, they look at the target list and decide if they should execute the command or not. If a Minion sees he is in the target list, he executes the command. If a Minion sees he is not part of the target list, he just ignores the message. A Minion that decides to run the command creates a local job id for the job and then performs the work.
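The "everyone gets the message, each Minion decides for himself" idea can be sketched in a few lines. This is only a toy model using glob matching from Python's standard library. The message keys, the job id, and the Minion ids are illustrative, not a dump of what Salt actually puts on the wire.

```python
import fnmatch


def master_expected_minions(accepted_keys, target):
    """Master side: accepted Minion ids that match the target glob."""
    return sorted(m for m in accepted_keys if fnmatch.fnmatchcase(m, target))


def minion_should_run(minion_id, message):
    """Minion side: every Minion receives the broadcast and checks the target itself."""
    return fnmatch.fnmatchcase(minion_id, message["tgt"])


accepted_keys = ["web1", "web2", "db1"]
# Hypothetical broadcast message: job id, target, function, and arguments.
message = {"jid": "20160530044257000000", "tgt": "web*",
           "fun": "cmd.run", "arg": ["echo HI"]}

expected = master_expected_minions(accepted_keys, message["tgt"])
ran = [m for m in accepted_keys if minion_should_run(m, message)]
print(expected)  # ['web1', 'web2']
print(ran)       # ['web1', 'web2']
```

The Master's expected list and the set of Minions that actually ran the job are computed independently, which is why a Minion that never saw the broadcast shows up later as a missing return.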

Whenever a Minion finishes its job, it returns a message to the Master with the output of the job. The Master marks that Minion off the target list as "Job Finished". While the Minions are working on their jobs, the Master asks all of the Minions every 5 seconds (timeout) whether they are still running the job they were assigned, providing them with the job id from the original job request. A Minion responds to this "find job" request with a status of "still working" or "Job Finished". If a Minion does not respond to the request within the gather_job_timeout period (5 seconds), the Master marks the Minion as "non responsive" for that polling interval. All Minions keep being queried on the polling interval until every responding Minion has been marked as "Job Finished" at some point. Any Minion not responding during each of these intervals keeps being marked as "non responsive". After the last Minion that has been responding responds with "Job Finished", the Master polls the Minions one last time. Any Minion that has not responded in the final polling period is marked as "non responsive".
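The polling loop described above can be modeled with a few lines of Python. To be clear, this is a toy simulation of the description in this post, not Salt's actual code. It uses virtual seconds instead of real waiting, and the function name and the way Minions are represented are my own assumptions.

```python
def simulate_job(minions, timeout=5, gather_job_timeout=5):
    """
    minions maps a Minion id to the second at which it returns its job,
    or None for a Minion that never answers anything.
    Returns (returned, not_returned, seconds_waited).
    """
    returned = set()
    responsive = {m for m, done_at in minions.items() if done_at is not None}
    now = 0
    while True:
        # Wait one polling interval, collecting the returns that arrived meanwhile.
        now += timeout
        returned |= {m for m in responsive if minions[m] <= now}
        if returned == responsive:
            # Every responsive Minion is finished: one last "find job" round,
            # waiting gather_job_timeout for stragglers, then give up on the rest.
            now += gather_job_timeout
            break
    not_returned = set(minions) - returned
    return sorted(returned), sorted(not_returned), now


# "a" finishes after 2s, "b" after 12s, "c" never answers.
print(simulate_job({"a": 2, "b": 12, "c": None}))
# (['a', 'b'], ['c'], 20)
```

Notice that the total wait is driven by the slowest responsive Minion plus one final polling round, and that bumping timeout up makes every round, including the last one, that much longer.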

The CLI will show the output from the Minions as they finish their jobs. For the Minions that did not respond, but are connected to the Master, you will see the message "Minion did not return". If a Minion does not even look like it has a TCP connection with the Master, you will see "Minion did not return. [Not connected]".

By this time the Master should have marked the job as finished. The job's info should now be available in the job cache. The above is a high level explanation of how the Master and Minions communicate. There are more details to this process, but this should give you a basic idea of how it works.

Takeaways From This Info

  1. There is no defined period on how long a job will take. The job will finish when the last responsive Minion has said it is done.
  2. If a Minion is not up or connected when a job request is sent out, then the Minion just misses that job. It is _not_ queued by the Master and sent at a later time.
  3. Currently there is no hard timeout to force the Master to stop listening after a certain amount of time.
  4. If you set your timeout (-t) to be something silly like 3600, then if even one Minion is not responding the CLI will wait the full 3600 seconds to return. Beware!

Missing Returns

Sometimes you know there are Minions up and working, but you get "Minion did not return" or you did not see any info from the Minion at all before the CLI timed out. It is frustrating, as you can send the same Minion that just failed a job and it finishes it with no problem. There can be many reasons for this. Try/check the following things.

Intel Skylake with an Asus Z170-AR and Ubuntu Linux 14.04
Posted on 10-26-2015 23:14:48 UTC | Updated on 10-26-2015 23:30:20 UTC
Section: /hardware/motherboard/ | Permanent Link

I bought an Asus Z170-AR (the -A is almost identical except for the DVI and VGA ports). This motherboard (mobo) runs the latest Intel Skylake processors and has some of the newest chipsets on board. This means that you will likely need one of the newest Linux kernels out there to support these chipsets. But if you're running Ubuntu 14.04 LTS and want to run the latest hardware, you're going to have a problem.

Be advised this was a mobo/RAM/processor upgrade and the OS was an original 14.04 install. 14.04.2 and newer point releases ship with a much later and more up-to-date kernel, according to Canonical.

Older kernels and newer hardware

Very new hardware usually needs the latest drivers to work correctly. The closer you are to having the latest Linux kernel, the better your chance that your newer hardware will work. The newer the kernel, the newer the driver. This does not work well with Linux distributions that have long term releases in which the major kernel version does not get updated frequently. This is not to say you can't go get more up-to-date kernel modules and build them, but when talking about built-in support, these kernels usually only get security updates and other fixes. Usually there are no features or hardware support added. This does not bode well for using Ubuntu LTS releases on new hardware.

Ubuntu LTS

Ubuntu made LTS releases so people could stay on them for up to 5 years and not worry about upgrading every 6 months or not getting updates anymore. It is a fantastic idea. The only problem was that if you stayed on an LTS release you were always stuck with the kernel that the release came with. Not any more.

Enter the LTS Enablement Stack

Canonical (the makers of Ubuntu) decided that they wanted LTS releases to be usable on newer hardware. They wanted everyone who wanted the stability of an LTS release to be able to use a newer Linux kernel. This would enable people to buy newer hardware and have better support for said hardware.

To make this a reality Canonical created the LTS Enablement Stack. This set of packages provides a much more recent Linux kernel and X support for existing LTS releases.

Asus z170 mobo and chipsets

As stated, I upgraded a machine to an Asus Z170-AR motherboard with Ubuntu 14.04 already installed. Everything worked well except for the network adapter and the on board audio. These are some pretty important things, so I just had to get them working. The following is what I did to get them working in Ubuntu 14.04.

Fixing the Ethernet adapter

With the stock 14.04 kernel (3.13.x) the Intel i219-V chip on the motherboard does not work. It does not work with kernel version 3.15.x either. I had to go all the way to the latest 3.19.x kernel version to get the e1000e driver working with this adapter. To do this I installed the LTS enablement stack for 14.04. Follow Canonical's instructions for it and run the apt-get line given there. After a reboot your ethernet adapter should be working.
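If you want to check a machine before swapping hardware, a quick sketch like the following compares a kernel release string against the 3.19 minimum that worked for me. The function names are my own, and the release strings in the example are only illustrative. On the machine itself you would pass in platform.release() (the same string uname -r prints).

```python
import re


def kernel_version(release):
    """Turn a release string such as '3.19.0-25-generic' into (3, 19)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release, minimum=(3, 19)):
    """True if the kernel's major.minor version is at least the minimum."""
    return kernel_version(release) >= minimum


# Illustrative release strings; use platform.release() for the running kernel.
print(kernel_is_new_enough("3.13.0-24-generic"))  # False
print(kernel_is_new_enough("3.19.0-25-generic"))  # True
```

Comparing tuples instead of strings matters here, since as plain text "3.9" would sort after "3.19".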

Fixing the audio

Getting the Realtek ALC892 chip on the motherboard working was much harder than just updating the kernel. The 3.19.x kernel did not have a working sound kernel module for this chip. I could see the chip listed using lspci so I knew it was being seen by the kernel.

# sudo lspci -vv

00:1f.3 Audio device: Intel Corporation Device a170 (rev 31)
        Subsystem: ASUSTeK Computer Inc. Device 8698
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
        Stepping- SERR- FastB2B- D
...
        Kernel driver in use: snd_hda_intel

It even said there was a kernel driver in use for it, which was a good sign, but there was still no sound. Alsa did not show any cards listed either. If your audio chip is recognized by the kernel, it will show up in Alsa when you run aplay.

# aplay -l

Not until I grepped the dmesg output did I find out why the kernel module was loaded but still not working. I saw the following when grepping dmesg for the PCI address listed in the lspci output.

# dmesg | grep '00:1f.3'

[   10.538405] snd_hda_intel 0000:00:1f.3: failed to add i915_bpo component master (-19)
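The same filtering can be done from a script, which is handy if you want to check several machines. This is just a sketch. The function name is made up, the first sample line is the one from my dmesg above, and the second sample line is filler invented for the example.

```python
import re


def lines_for_device(dmesg_text, pci_address):
    """Return dmesg lines that mention the PCI address, with timestamps stripped."""
    hits = []
    for line in dmesg_text.splitlines():
        if pci_address in line:
            # Drop a leading "[   10.538405] " style timestamp if present.
            hits.append(re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line))
    return hits


sample = (
    "[   10.538405] snd_hda_intel 0000:00:1f.3: failed to add i915_bpo component master (-19)\n"
    "[   10.600000] unrelated line (made up for this example)\n"
)
print(lines_for_device(sample, "00:1f.3"))
# ['snd_hda_intel 0000:00:1f.3: failed to add i915_bpo component master (-19)']
```

To run it against the live kernel log, feed it the output of subprocess.run(["dmesg"], capture_output=True, text=True).stdout instead of the sample text.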

Now I had something to search for to see if anyone else was seeing the same error. Google turned up this post from someone with the exact same problem. The answer was to update to a very cutting edge Alsa kernel module. So I followed the instructions here to update 14.04's Alsa sound system with the latest kernel module. Be sure to choose the correct kernel module .deb package for your kernel on the "ALSA daily build snapshots" page. If you're using 14.04.x choose "oem-audio-hda-daily-lts-vivid-dkms - ... ubuntu14.04.1". After installing the .deb package and rebooting, "aplay -l" saw the audio chip. Hooray!

The last thing to do to get the audio fully working is to choose the correct output in your desktop's sound system. I use the XFCE desktop, so what I had to do was start "Pulse Audio Volume Control", then go to the "Configuration" tab and turn "HDA Nvidia" -> Off for my Nvidia video card. Then in "Built in Audio" I had to select "Digital Stereo IEC958 Output" for my output. There are lots of outputs. My suggestion is to play an audio file on repeat, then go through each audio output to find the one that works.

Netflix streaming playback speed and hidden menus
Posted on 05-16-2015 19:41:47 UTC | Updated on 05-17-2015 17:35:20 UTC
Section: /software/chrome/ | Permanent Link

Netflix and playback speed

I have always been interested in Netflix streaming, but I could not get over not being able to adjust the playback speed. That and the movie selection stinks (out of 159 titles in my queue only 40 are available to be streamed). I still tried streaming for 30 days and really liked some of the original content Netflix had. During that time I did not find any native way to adjust the playback speed in their html5 player. This really stinks, as I am used to watching video with VLC, YouTube, and MythTV, all of which allow for playback speed adjustment up to 2x. It is hard to watch anything at lower speeds anymore. After a bit more searching I finally found a way to change the playback speed in Netflix streaming video.

Chrome extension Video Speed Controller

After much searching I found a Google Chrome extension called Video Speed Controller, which allows for the speedup of html5 video streams in the browser. Netflix has an html5 player, so this works with Netflix streams. Hooray!! The sped-up stream does not have any buffering issues, and I've watched many things at 2x playback without issue. Unfortunately you can't use this anywhere other than the browser. Come on Netflix, please add playback speed adjustment to all of your players on all platforms!!!

Performance at higher speeds

Beware that when speeding up video you will need to make sure your video card can handle the higher frame rates. When I play a video file from local storage with a local program like VLC at 2x, it is very smooth, without much jitter or video tearing. When streaming content from Netflix, YouTube, or other sites using this browser plugin, you are using the browser as the video player, and it might not be as clean an experience. You might notice some more jitter in the video and some frame tearing in fast moving scenes. This usually happens if your hardware can't keep up with rendering the video or if hardware acceleration is not enabled.

Chrome hardware acceleration

My testing for this is being done on Linux with Google Chrome using the accelerated Nvidia drivers with a GeForce GT 520, a video card from 2011. What I had to do to get rid of the jitter and tearing was turn hardware acceleration on in Chrome. Google seems to be very careful about whether or not they turn on hardware acceleration. They usually don't turn it on unless they know for sure it will work. Mine was not turned on even though I had a video card that supported acceleration and the correct drivers. Chrome's detection of this might not be the best, or Google did not think the drivers for this card were stable enough. What is great is you can turn it on anyway and it works great.

Turn on hardware acceleration in Chrome

To see if your hardware acceleration is already turned on, type "chrome://gpu" in your Chrome URL bar. If it's turned off you will likely see lots of red text indicating so. Look for "Video decode" and see if it says "Disabled". If so, you will need to turn it on. If it says "Hardware accelerated" then you're good and can skip the rest of this paragraph. To turn on hardware acceleration, type "chrome://flags" in the Chrome URL bar. The first setting says "Override software rendering list". Click "Enable" to turn on the override, then restart Chrome. Go back to "chrome://gpu" and see if it says "Video decode: Hardware accelerated". If so, go try to watch a movie or a YouTube video again and it should be much smoother and likely have less tearing. If you don't have less tearing, go into your video driver config and make sure you have Sync to VBlank enabled. Mine is in the Nvidia X server settings manager under "OpenGL settings" -> Performance. Check the box "Sync to VBlank".

Other neat Netflix streaming things

Netflix provides some hidden menus for stats and changing bit rates of your streams. Here are some different ones to try. I tried these using Chrome on Linux. They all worked for me.
