
I need to install a MySQL database.
To check if you have a MySQL database server running, you can type
# sudo service mysql status
or
# mysql
If you get the error
Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock'
then it usually means that MySQL is not installed, or is installed but not running.
So in my case, since I have a brand-new server that doesn't have MySQL installed by default, I need to install MySQL.
# sudo apt install mysql-server
Check if it is running.
# sudo service mysql status
It should now indicate that MySQL is active.
I created an A record at my DNS provider for zabbix.your-domain.tld that points to the IP address of the server.
Adding a domain name to your Zabbix Server is optional.
The Zabbix Server doesn't have any transport encryption enabled yet, so any messages passed between our browser and the server are in plain text. We should secure our server asap with an SSL certificate.
I create the certificate using options provided by Let's Encrypt. This has the added benefit of being free.
So, we need to SSH into the Zabbix Server and install Certbot.
# sudo snap install --classic certbot
# sudo ln -s /snap/bin/certbot /usr/bin/certbot
# sudo certbot --apache
Follow the prompts, and at the end, your Zabbix Server will have an SSL certificate bound and will be accessible via HTTPS.
I show how my firewall has been set up so far in the course.
We look over what we have so far, and how we got there.
Check status of services.
# sudo service zabbix-server status
# sudo service zabbix-agent status
# sudo service apache2 status
# sudo service mysql status
Zabbix Server configuration file
# sudo nano /etc/zabbix/zabbix_server.conf
Zabbix Agent configuration file
# sudo nano /etc/zabbix/zabbix_agentd.conf
I install Zabbix Agent on an Ubuntu 24.04 server on the same network as my Zabbix Server.
After organizing a new server, I then download and install the Zabbix repository on the server.
I re-visit the Zabbix download page at
https://www.zabbix.com/download
I have the Zabbix Packages tab active.
I then choose 7.0 LTS, Ubuntu and 24.04 (Noble).
Also note that I am not installing the Zabbix server again, so it doesn't matter what selections I have for the database or web server.
Make sure you select the correct version for your operating system and architecture.
Also make sure that your agents are using the same version as your Zabbix server.
I download and install the Zabbix agent using the Windows option.
I set this agent's ServerActive parameter, and configure a template in Zabbix with Agent (Active) items only.
Since the template is using Agent(Active) items only, I do not need to create firewall forwarding rules.
I demonstrate auto registration of Zabbix Agents. My Zabbix agents are on a combination of networks and are a mixture of Linux and Windows.
When an agent starts up, it makes a request to the Zabbix server using the address in its ServerActive parameter and starts collecting values for any items of type Zabbix agent (active) that it receives back.
We can make use of this behavior to auto register the host on the Zabbix server side if no record of it already exists. This saves some time if you have many hosts that you want to add to the Zabbix server.
When the agent makes a request, we can configure several actions to run on the Zabbix server. In this example, I read the value of the HostMetadata parameter in the Zabbix agent's configuration file, and decide either to set it up as a Linux host and assign a Linux template, or to set it up as a Windows host and assign a Windows template.
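The decision made by the auto registration actions can be pictured roughly as follows. This is only a Python sketch of the logic, not how Zabbix implements it. In practice you configure it as actions in the UI. The function name, host group names and template names are assumptions chosen for illustration, and lower-casing the metadata is a convenience of this sketch.

```python
# Sketch of the auto registration decision. In Zabbix this is configured as
# auto registration actions; all names below are illustrative assumptions.

def choose_setup(host_metadata: str) -> dict:
    """Pick a host group and template from the agent's HostMetadata value."""
    metadata = host_metadata.lower()
    if "linux" in metadata:
        return {"group": "Linux servers", "template": "Linux by Zabbix agent active"}
    if "windows" in metadata:
        return {"group": "Windows servers", "template": "Windows by Zabbix agent active"}
    # No action condition matched, so nothing is set up for this host.
    return {}

print(choose_setup("Linux"))
print(choose_setup("Windows"))
```

An agent whose HostMetadata matches neither condition is simply left alone, which is why it is worth choosing metadata values deliberately.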
By default, agent communication is done in clear text.
For encryption we have an option to use PSK-based encryption.
PSK means pre-shared key.
The PSK option consists of two important values, the PSK identity and the PSK secret.
The secret should be at least 128 bits (a 16-byte PSK, entered as 32 hexadecimal digits) and at most 2048 bits (a 256-byte PSK, entered as 512 hexadecimal digits).
You can generate a 256-bit PSK secret with openssl using the command
# openssl rand -hex 32
In this lecture, I also save it straight to a file.
I first create and navigate to a folder
/home/zabbix/
I then run,
# openssl rand -hex 32 > secret.psk
I also make sure that only the zabbix user and its group can read the file.
# chown zabbix:zabbix secret.psk
# chmod 640 secret.psk
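For readers who prefer to script this step, here is a minimal Python sketch of the same idea: generate a 256-bit secret, check it against the length limits described above, and write it with mode 640. The function names are my own, and a temporary directory stands in for /home/zabbix/. It does not change ownership, so chown is still needed on a real host.

```python
import os
import re
import secrets
import tempfile

def generate_psk(num_bytes: int = 32) -> str:
    """Return a random PSK as hexadecimal digits (32 bytes gives 64 hex digits)."""
    return secrets.token_hex(num_bytes)

def is_valid_psk(psk: str) -> bool:
    """Check the limits described above: 32 to 512 hex digits, whole bytes only."""
    return (re.fullmatch(r"[0-9a-fA-F]+", psk) is not None
            and 32 <= len(psk) <= 512
            and len(psk) % 2 == 0)

def write_psk(path: str, psk: str) -> None:
    """Write the PSK, then restrict the file to owner read/write and group read."""
    with open(path, "w") as handle:
        handle.write(psk + "\n")
    os.chmod(path, 0o640)

if __name__ == "__main__":
    psk = generate_psk()
    # A temporary directory stands in for /home/zabbix/ in this sketch.
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "secret.psk")
        write_psk(target, psk)
        print(len(psk), is_valid_psk(psk))
```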
I then reconfigure the Zabbix agent configuration file.
# sudo nano /etc/zabbix/zabbix_agentd.conf
and change the options near the bottom,
TLSConnect=psk
TLSAccept=psk
TLSPSKFile=/home/zabbix/secret.psk
TLSPSKIdentity=[whatever you like]
I then restart the agent
# sudo service zabbix-agent restart
I then go into the Zabbix Server User interface and configure the PSK encryption options for the host.
I select the
'Connections to host' = PSK
'Connections from host' = PSK
'PSK Identity' = [whatever you used in the Zabbix agent config]
'PSK' = [the long hex string generated from the OpenSSL command above]
After a minute or two, the Zabbix Server and Agent will successfully communicate using PSK encryption.
I manually create 3 items on the host. 1 Passive and 2 Active.
The passive check is for agent.ping
The active checks are for disk space used and total.
In this section, we create a basic trigger.
The trigger will check for nodata from a host for 120 seconds.
I manually create 2 Graphs for this host. Before starting this section, you should manually create some new items for your host from the table in the documentation.
In this section, I show you how to create a template from the new host items, triggers and graphs that we've created in the last few videos. I also assign the new template to a newly discovered Windows host, and replace the original items, triggers and graphs from an earlier host with the new template that we've just created.
We look at creating custom dashboards for our templates.
We look at the Dashboards menu options.
I use Monitoring ⇾ Maps to create a network map from scratch. I show how to link the map icon to the host so that the host problems also appear on the graphical representation.
In this lecture I create an advanced item. The item reads the Windows event logs and looks for the specific Windows event ID 4625, which is also known as 'failed logon'.
The item type is Zabbix Agent (Active)
and the key is
eventlog[Security,,,,4625,,skip]
The type of information is Log
The duration to keep the data and the frequency of checking for the item is up to you.
I then try to log on to my Windows laptop and generate some failed logins.
I then see the failed login events on the Monitoring ⇾ Latest Data page.
It may be useful to set up a trigger for failed logons.
In the video, I create the trigger using the expression
logeventid(/Windows Basic/eventlog[Security,,,,4625,,skip])=1
and also enable Allow manual close.
In this lecture, I add a pre-processing step to the item so that only the 1st line of the Windows failed logon event description is kept.
The pre-processing regex value is
(.*)
which matches from the start of a line to its end, because the dot does not match a line break by default.
And the output is
\0
which refers to the entire text matched by the regex.
So, all in all, it returns only the first line.
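To see why this pattern keeps only the first line, here is a small Python sketch. Python's re module is used as a stand-in for the regex engine in Zabbix, which is an assumption, although both treat the dot the same way by default. The sample event text is made up.

```python
import re

def first_line(event_text: str) -> str:
    """Apply the pattern (.*) and return the whole match, like the \\0 output."""
    match = re.search(r"(.*)", event_text)
    return match.group(0)

# Made-up, shortened event description with several lines.
sample = "An account failed to log on.\n\nSubject:\n\tSecurity ID:\tS-1-0-0"
print(first_line(sample))
```

The match stops at the first line break because the dot does not cross it, so everything after the first line is dropped.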
I demonstrate how to use JavaScript to pre-process incoming item information.
In this video we look at using Web Scenarios for monitoring a website running on a server.
Web scenarios consist of one or more HTTP requests known as "steps" which are executed in sequence.
Even though Web Scenarios are assigned in the host configuration, they are not executed by the host.
Web Scenarios are executed from the Zabbix server, or proxy if a host is monitored by proxy.
Many devices and services now provide REST APIs that you can query and get JSON formatted data as a response. In this video I will show you how to create an Item that uses the HTTP Agent type to query an external REST API and extract its data.
In case you don't have a REST API handy that you can query, I have created a test API that you can use. It has several methods that return test data.
Monitoring Log Files - HTTP Status Codes of an Nginx Proxy
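As a rough illustration of what extracting a value from a JSON response involves, here is a Python sketch. The response body, field names and function name are all hypothetical. In Zabbix the extraction would normally be a JSONPath pre-processing step on the HTTP Agent item rather than code like this.

```python
import json

# Hypothetical response body. A real HTTP Agent item would receive this
# from the REST API it queries.
response_body = '{"device": {"name": "sensor-1", "temperature": 21.5}}'

def extract_temperature(body: str) -> float:
    """Equivalent in spirit to the JSONPath $.device.temperature."""
    data = json.loads(body)
    return data["device"]["temperature"]

print(extract_temperature(response_body))
```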
The file I monitor is located at /var/log/nginx/access.log
The default user that the zabbix agent uses does not have read access to many log files on the system.
You can usually add the zabbix user to a group to solve this problem.
The access.log file can be read by the adm group on Ubuntu, so I add the zabbix user to the adm group.
To find out which groups a log file can be read by, for example, I typed,
$ ls -lh /var/log/nginx/
This tells me that the access.log file can be read by the adm group.
Then I check which groups the user zabbix is part of,
$ groups zabbix
If it's not already part of the adm group, I then add it,
$ sudo usermod -a -G adm zabbix
and check again to confirm.
$ groups zabbix
I can read the most recent log file entries by typing
$ tail -f /var/log/nginx/access.log
I then created an item for the host, with settings
Name: HTTP Status Codes
Type : Zabbix agent (active)
Key: log[/var/log/nginx/access.log,"^(\S+) (\S+) (\S+) \[([\w:\/]+\s[+\-]\d{4})\] \"(\S+)\s?(\S+)?\s?(\S+)?\" (\d{3}|-) (\d+|-)\s?\"?([^\"]*)\"?\s?\"?([^\"]*)\"",,,,\8,]
Type of Information : numeric (unsigned)
Update Interval : 5s
The regex value that I copy into regex101 is
^(\S+) (\S+) (\S+) \[([\w:\/]+\s[+\-]\d{4})\] \"(\S+)\s?(\S+)?\s?(\S+)?\" (\d{3}|-) (\d+|-)\s?\"?([^\"]*)\"?\s?\"?([^\"]*)\"
This regex can separate the values for Nginx and Apache logs.
The regex splits each row of the log into several groups. The HTTP Status code is in the 8th group.
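If you don't have regex101 open, the same check can be done with a few lines of Python. This is a sketch that applies the regex above to a made-up log line in the usual combined format and returns group 8. The function name and the sample line are my own.

```python
import re

# The regex from above, split over two lines only for readability.
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([\w:\/]+\s[+\-]\d{4})\] \"(\S+)\s?(\S+)?\s?(\S+)?\" '
    r'(\d{3}|-) (\d+|-)\s?\"?([^\"]*)\"?\s?\"?([^\"]*)\"'
)

def http_status(line: str):
    """Return the HTTP status code (group 8), or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    return match.group(8) if match else None

# Made-up access log line.
sample = ('203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] '
          '"GET /index.html HTTP/1.1" 404 153 "-" "curl/8.5.0"')
print(http_status(sample))
```

Group 8 corresponds to the \8 output parameter in the item key, which is why the item can store a plain number.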
I can also create triggers to notify on
101 Switching Protocols
301 Moved Permanently
302 Redirect
304 not modified
400 Bad Request
401 Unauthorised
403 Forbidden
404 Not found
500 Server Error
In this video I demonstrate creating a trigger to detect 10 or more HTTP 404 Errors in 1 minute.
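The condition this trigger evaluates can be sketched in Python as a count over a time window. This only illustrates the logic. In Zabbix it is written as a trigger expression on the item, and the function name and sample data here are made up.

```python
def too_many_404s(events, now, window=60, threshold=10):
    """events holds (timestamp, status) pairs.

    Returns True when at least `threshold` 404 responses fall inside the
    last `window` seconds before `now`.
    """
    recent = [status for ts, status in events
              if now - window < ts <= now and status == 404]
    return len(recent) >= threshold

# Ten 404 responses between t=100 and t=109, plus one unrelated 200.
events = [(100 + i, 404) for i in range(10)] + [(105, 200)]
print(too_many_404s(events, now=120))
print(too_many_404s(events, now=200))
```

Once the window has moved past the burst of errors the count drops back below the threshold, which is how the problem resolves itself.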
In this video I expand the log file monitoring item from the last video, into 1 master item that returns the whole log line, and then create several dependent items from the specific data contained in the log line.
Creating dependent items means that the agent doesn't need to run possibly identical queries on a host many times in order to extract parts of a value. The master item runs once on the host, and then the Zabbix server (or Zabbix proxy if the host is managed by a proxy) updates the dependent items each time the master item gets a new value.
I also convert the new items into a template, create a graph and screen consisting of the graph created from the dependent item plus tables also showing values from the dependent items.
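The master and dependent item relationship can be pictured with this Python sketch. One function stands in for the single read on the host, and each dependent item is just a small extraction applied to that one value. The item names, the sample log line and the simple string splitting are assumptions made to keep the example short.

```python
def collect_master_value() -> str:
    """Stands in for the single log read performed on the host."""
    return ('203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] '
            '"GET /index.html HTTP/1.1" 404 153')

# Each dependent item is a pre-processing function applied to the master value.
DEPENDENT_ITEMS = {
    "client_ip": lambda line: line.split()[0],
    "http_method": lambda line: line.split('"')[1].split()[0],
    "status_code": lambda line: line.split('"')[2].split()[0],
}

def update_dependent_items(master_value: str) -> dict:
    """Derive every dependent value from one master value, without re-reading the log."""
    return {name: extract(master_value) for name, extract in DEPENDENT_ITEMS.items()}

print(update_dependent_items(collect_master_value()))
```

Adding another dependent item costs the host nothing extra, since only the entry in the dictionary changes and the master value is still collected once.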
For versions before Zabbix 5.0.2: in the zabbix_agentd.conf file on the remote host, add EnableRemoteCommands=1 and then restart the agent process.
In Zabbix 5.0 and 5.0.1, you will also need to comment out the DenyKey parameter, which blocks system.run by default, and then restart the agent process.
In Zabbix 5.0.2 and later, you can ignore EnableRemoteCommands=1 since it is now deprecated, and you should use a combination of DenyKey and AllowKey to fine-tune the scripts you want to deny or allow.
In this lecture I use the agent running on my Zabbix server to monitor days remaining before SSL expiry by creating a custom script and executing it using the system.run item key option.
You can use any Linux agent you desire to run this script.
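The script used in the lecture is not reproduced here. As a hedged illustration of the "days remaining" calculation only, here is a Python sketch. It assumes the expiry date is already available as a string in the notAfter format that Python's ssl module reports for certificates. The function name and dates are made up.

```python
import ssl
import time

def days_remaining(not_after: str, now=None) -> int:
    """Whole days between `now` (epoch seconds) and the certificate's notAfter date."""
    if now is None:
        now = time.time()
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // 86400)

# A fixed "now" keeps the example reproducible. Both dates are made up.
now = ssl.cert_time_to_seconds("Jan  1 00:00:00 2025 GMT")
print(days_remaining("Jan 11 00:00:00 2025 GMT", now=now))
```

Returning a plain integer keeps the value easy to store as a numeric item and to compare in a trigger.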
User Parameters in Zabbix.
I create several examples of UserParameters in this video.
Example 1
Starting as simple as possible.
I create an item to check isalive
Inside the Zabbix agent configuration file, on the host that will run the UserParameter, I add
UserParameter=isalive,echo 1
I save it, then I test it using,
$ zabbix_agentd -t isalive
I then restart the zabbix agent process,
$ sudo service zabbix-agent restart
I then add a new item to my host with,
name = is alive
key = isalive
type of information = numeric
I go to Monitoring ⇾ Latest Data and wait for it to appear.
Example 2
And now for something a bit more complicated, and that is Flexible User Parameters.
Inside the conf file I add,
UserParameter=isalive[*],echo $1
I can test it using
$ zabbix_agentd -t isalive[123]
Restart the zabbix agent process,
$ sudo service zabbix-agent restart
The item inside Zabbix Server has,
name = is alive
key = isalive[123456]
type of information = text
Example 3
And now, I convert an existing system.run command to a UserParameter.
The script called in this system.run command is outlined in my previous lecture, "Check SSL Certificate Expiry on Websites using Custom Script and system.run".
Inside the conf file I add,
UserParameter=checkssl[*],/home/zabbix/checkssl.sh $1 $2
I can test it using
$ zabbix_agentd -t checkssl[example.com,443]
Restart the zabbix agent process,
$ sudo service zabbix-agent restart
The item inside Zabbix Server has,
name = Check SSL example.com
key = checkssl[example.com,443]
type of information = numeric
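The way a flexible UserParameter turns an item key into a command can be sketched in Python as below. This is a simplified model and not the agent's actual implementation. It ignores quoting rules, and replacing a missing parameter with an empty string is an assumption of this sketch. The function name is my own.

```python
def expand_user_parameter(command: str, item_key: str) -> str:
    """Replace $1..$9 in `command` with the parameters inside the key's brackets."""
    if "[" in item_key:
        inner = item_key[item_key.index("[") + 1:item_key.rindex("]")]
        params = inner.split(",")
    else:
        params = []
    for position in range(1, 10):
        # Missing parameters become empty strings in this sketch.
        value = params[position - 1] if position <= len(params) else ""
        command = command.replace(f"${position}", value)
    return command

print(expand_user_parameter("echo $1", "isalive[123]"))
print(expand_user_parameter("/home/zabbix/checkssl.sh $1 $2", "checkssl[example.com,443]"))
```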
I create 3 calculated items and a graph in this lecture.
1.
name = Changes MySql Questions
type = calculated
key = mysqlchanges
formula = change(mysql.questions)
2.
name = Average of Changes of MySql Questions
type = calculated
key = avgmysqlchanges
formula = avg(mysqlchanges,#5)
3.
name = Forecast of Changes of MySql Questions
type = calculated
key = forecastmysqlchanges
formula = forecast(mysqlchanges,1h,,10m,exponential)
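To make the first two formulas concrete, here is a Python sketch of what change and a five-value average compute. The readings are made up and the helper names are my own. The forecast function is left out because it involves curve fitting that would not fit in a short example.

```python
def change(values):
    """Difference between the latest and the previous value."""
    return values[-1] - values[-2]

def avg_last(values, count=5):
    """Average of the most recent `count` values, like the #5 argument."""
    recent = values[-count:]
    return sum(recent) / len(recent)

# Made-up readings of a steadily growing counter such as mysql.questions.
questions = [100, 130, 150, 190, 200, 260, 300]

# The history of the first calculated item: one change per new reading.
changes = [b - a for a, b in zip(questions, questions[1:])]

print(change(questions))
print(avg_last(changes))
```

The second item smooths the first, so a single busy interval moves the average far less than it moves the raw change.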
In this lecture, I demonstrate how to use Global scripts (a.k.a Administration scripts) from the Zabbix Server, Zabbix Proxy and Zabbix Agents.
Configuration depends on
from which process, on which server, the script will actually be executed,
whether remote commands are enabled for the process executing the script,
whether the Zabbix user will require sudo privileges for any part of the script command,
whether or not the agent has any working passive checks.
In this video, I demonstrate how to set up and troubleshoot the pre-existing administration scripts, ping and detect operating system.
When calling the ping administration script for a host behind a Zabbix Proxy, the Zabbix Proxy configuration will also need to be updated to allow remote commands. This is because the ping will be executed from the Zabbix proxy server. You should also restart the Zabbix proxy after any configuration change.
The administration script used to detect the operating system runs the nmap executable from either the Zabbix Server or the Zabbix Proxy. nmap needs to be installed on that server or proxy first, and you will also need to give the Zabbix user sudo privileges to execute it.
# sudo visudo
Add the line,
zabbix ALL=(ALL) NOPASSWD: /usr/bin/nmap
Press Ctrl-X and Y to save.
I will also demonstrate creating a script that runs from the Zabbix agents themselves. This will use the free command to list available memory.
Now that we have a few hosts set up in different template configurations and on different networks, we can experiment with Zabbix server health.
Values processed per second
Values processed per second (VPS) indicates how busy your Zabbix server is. This number may be high or low and is used as a guide to help you know when other issues may start to occur. If the number is higher than usual and none of the other graphs indicate a problem, then you can consider that OK. You can manage this value by enabling/disabling items, triggers and discovery rules for your hosts.
Utilization of data collectors
Depending on the types of items you have set up for your hosts, different pollers (data collectors) will be used to perform the task of requesting or receiving the item data.
Passive checks are managed by the poller data collector, ping checks by the ICMP data collector, web-scenarios by the http data collector, the trapper data collector handles incoming checks from active hosts and there are many other collectors handling different protocols.
When you make changes to a host, you can review this graph to see what impact it had.
Utilization of internal processes
Zabbix runs many internally scheduled tasks to do with housekeeping the SQL database, managing LLD, alerting, pre-processing, writing logs and more. Also monitor this graph to understand the impact of any changes you make.
Cache usage
The value cache is used to speed up calculations of trigger expressions, calculated items, dependent items and other things within Zabbix where it is more optimal to pull historical data straight from memory rather than re-querying the database tables every time a value is needed.
The graph summarizes several caches used within Zabbix.
If any of the cache usages go above 80% then consider adjusting the Zabbix server's CacheSize setting.
The CacheSize setting is in the zabbix_server.conf file. The default is 8M. You can change this from 128K to 64GB. You will need to adjust this as you manage more hosts, especially if they have many triggers, calculated items, dependent items and other host related statistics and properties stored in the cache.
Value cache effectiveness
The two important values shown in this graph are hits and misses. A hit is when a value was retrieved from memory. A miss happens when the data is not currently in memory and needs to be retrieved from the database first. Aim to have as few misses as possible by increasing the value cache size (the ValueCacheSize setting in zabbix_server.conf) if necessary, or by reducing the number of items and triggers you are processing for a host.
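The arithmetic behind these two graphs is simple enough to sketch in Python. The numbers are made up, the function names are my own, and the 80% limit is the rule of thumb given for cache usage.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Percentage of value cache requests that were served from memory."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0

def cache_needs_attention(used_percent: float, limit: float = 80.0) -> bool:
    """Apply the 80% rule of thumb for cache usage."""
    return used_percent > limit

print(hit_ratio(hits=950, misses=50))
print(cache_needs_attention(85.0))
```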
Queue size
Passive checks are placed into a queue and the request/response is handled as soon as possible. Some requests to hosts don't resolve quickly, for many possible reasons: the query may be complicated to answer, or the host may be experiencing other resource issues such as high CPU, low memory or low network bandwidth, or it may be switched off or in the process of restarting. There may then be a backlog of unanswered requests waiting to be resolved.
Ideally you want the values in this graph to be as low as possible.
In the course we can see that one of the hosts has many unresolved requests in the queue. This can be caused by changing templates often or other adjustments to configurations that you may make to a host. In this example, my issue is caused by many passive checks not being quickly handled by my Windows host behind the proxy. I can fix this queue issue by reconfiguring it to use the active version of the Windows template. This particular fix won't always be the same for you. I could have also reduced the number of items it was trying to retrieve in order to make the template less demanding on my host.
To see a list of items in the queue, and which host they relate to, visit the page Administration ⇾ Queue ⇾ Queue details.
Summary
When adding hosts or making other changes to Zabbix then recheck the Zabbix Health dashboard regularly to get a good feel of what your change has done. Also note that the supplied templates will have many items, triggers, discovery rules and more enabled by default that you don't actually need. Disable everything that isn't critical for your use case to save resources when Zabbix health starts to indicate problems.
Introduction to configuring Zabbix Users, Groups and Roles.
The correct Zabbix repository was already installed in the last section, so we can just run
# sudo apt install zabbix-agent
On Linux hosts, to enable auto restart after reboot,
# sudo systemctl enable zabbix-agent.service
If your Zabbix Server and Proxy are communicating successfully, as can be verified in the Zabbix UI ⇾ Administration ⇾ Proxies page, then we can now set up the agent on the proxy itself to retrieve items.
So, on the proxy server, enter
# sudo nano /etc/zabbix/zabbix_agentd.conf
This time I set up PSK encryption specifically for communications between the Zabbix Server and Zabbix Proxy. Enabling PSK encryption for Agents behind a Proxy only encrypts communications between the Agent and the Proxy. If your agents are in a DMZ then you may not desire encryption. But you should at least encrypt the communication between the Zabbix Server and Proxy if it travels across a public network.
I reconfigure my existing Zabbix Agents on CentOS 7, Ubuntu 20, Windows 10 and macOS to use the new Zabbix Proxy we've just set up.
For each host that I need to be proxied, I need to edit its agent config file to use the RaspberryPi for both its Server and ServerActive settings.
I then go into the Zabbix UI and reconfigure the hosts to be monitored by proxy. I also update the agent interface information to reference the servers from the perspective of the proxy, i.e., the IP address needs to be the local address on the network, or use the host name that can be resolved on the local network.
In this lecture I install and configure Zabbix Agent to run on macOS.
The Mac is behind a firewall and I configure it to be managed via the Zabbix proxy.
After installing the Zabbix Agent on macOS, you will need to configure it using,
# sudo nano /usr/local/etc/zabbix/zabbix_agentd.conf
Edit the Server, ServerActive and Hostname parameters, save and restart the agent.
The commands to restart Zabbix Agent on the macOS are,
# sudo launchctl unload /Library/LaunchDaemons/com.zabbix.zabbix_agentd.plist
then
# sudo launchctl load /Library/LaunchDaemons/com.zabbix.zabbix_agentd.plist
Go back to the Zabbix Server, add the host to be managed by the proxy. Then update the proxy cache.
# sudo zabbix_proxy -R config_cache_reload
We can add a Zabbix Proxy Health dashboard to our Proxy host configuration.
Go to Configuration ⇾ Hosts and select the proxy, and add the template Zabbix Proxy Health
SSH onto the Zabbix proxy and do a config cache reload.
# sudo zabbix_proxy -R config_cache_reload
Many corporate networks will have devices on them that support SNMP.
SNMP stands for Simple Network Management Protocol.
Some examples of common devices that support the SNMP protocol are routers, switches, printers and NAS servers. Also Linux, Windows and MacOS operating systems can be configured to support SNMP.
You would typically use SNMP on a device that doesn't have an operating system where you can install a Zabbix agent.
And not all devices support SNMP, but on my network, I do have a device that supports SNMP and that is a MikroTik 260GS switch. I will demonstrate how to set this up in Zabbix.
In the next few lectures we will look at SNMP.
Common devices that support SNMP are routers, switches, printers, servers, workstations and other devices found on IP networks.
Not every network device supports SNMP, or has it enabled, and there is a good chance you don't have an SNMP enabled device available that you can use in this lecture.
So, in the next few lectures, I will demonstrate setting up SNMP on various devices.
We will set up Zabbix to query using OIDs first.
We will manually create a few sample SNMP items.
Then demonstrate setup and querying with MIB descriptions. MIB stands for Management Information Base.
And then use LLD to discover new SNMP devices and automatically configure them in Zabbix.
I configure one of my hosts to use a more sophisticated SNMP template. I will need to allow less restrictive SNMP OID prefixes in the SNMPD process, and then restart.
Querying SNMP agents with MIB descriptions is likely to fail by default.
The example below will fail if no MIB descriptions are installed on the server executing the snmpwalk command,
$ snmpwalk -v 2c -c mycommunity ###.###.###.### IF-MIB::ifInOctets.1
> Cannot find module (IF-MIB)
> IF-MIB::ifInOctets.1: Unknown Object Identifier
We can enable querying by MIB descriptions by running this command on the Zabbix server itself.
$ sudo apt install snmp-mibs-downloader
Now this command will work
$ snmpwalk -v 2c -c mycommunity ###.###.###.### IF-MIB::ifInOctets.1
> IF-MIB::ifInOctets.1 = Counter32: 566637161
I then update my items for the host in Zabbix, to query using MIB descriptions.
And then restart the Zabbix server process.
$ sudo service zabbix-server restart
I set up a Network Discovery Rule to find all SNMP devices on my local network, and an Action to auto configure them.
The rule will scan all internal IP addresses for accessible SNMP daemon system descriptions, read the response, and then use that response to add the device to a server group and to auto configure it with the Generic SNMP template.
My network has a TP-Link router and a Cisco switch which are both SNMP capable devices.
Receiving SNMP traps is the opposite of querying SNMP devices.
Information is sent from an SNMP device and is collected or "trapped" by Zabbix.
SNMP Traps are sent to the server on port 162 (as opposed to port 161 on the agent side that is used for queries).
So port 162 will need to be allowed on the Zabbix Server or Proxy, whichever will receive the SNMP traps.
For SNMP Traps to work, you need to configure some settings for either the Zabbix Server, or Zabbix Proxy.
Download zabbix_trap_receiver.pl
snmptrapd is an SNMP application that receives and logs SNMP TRAP and INFORM messages.
I demonstrate configuring my Cisco switch to send SNMP traps to the server with snmptrapd listening. Zabbix proxy is also running on the same server, and will forward the messages on to the Zabbix server where the host is configured.
I demonstrate creating some triggers for some SNMP traps.
In the first trigger, I want a notification every time my specific snmptrap[Reload Command] item receives a value.
nodata(/Cisco Catalyst 2950 Switch/snmptrap[Reload Command],2m)=0
In the second trigger example, I search for some text in any incoming snmptrap.fallback item values.
find(/Cisco Catalyst 2950 Switch/snmptrap.fallback,,,"SNMPv2-MIB::coldStart")=1
Using everything we've seen so far, we can now move onto something much more advanced, and that is to create a custom Low Level Discovery Rule from the ground up.
The concept of a discovery rule is to have a script that, when run, will automatically discover groups of something on your host, or even your network. For example, databases on a database server.
Some Inbuilt Native Zabbix Discovery rules are
vfs.fs.discovery : Used by Mounted Filesystem Discovery
net.if.discovery : Used by Network Interface Discovery
vfs.dev.discovery : Used by Block Devices Discovery
These rules are created and assigned to your hosts when you add certain templates.
You can inspect the output of these rules on your hosts by running,
# zabbix_agentd -t vfs.dev.discovery
# zabbix_agentd -t net.if.discovery
# zabbix_agentd -t vfs.fs.discovery
Each JSON output is dynamically created at runtime and contains a list of macro keys and values that will be specific to what was found from the perspective of the host where it was run. The macros can be used by the discovery rule's Item, Trigger, Graph and Host Prototypes.
To start with this exercise, we will create a discovery rule to monitor chosen services running on a host (Ubuntu 20.04).
The discovery rule will be part of a template that can later be assigned to your hosts.
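The script used in the lecture is not reproduced here. As a hedged sketch of the kind of JSON a custom discovery script needs to print, here is a Python version. The macro name {#SERVICE}, the function name and the list of services are assumptions for illustration.

```python
import json

def build_discovery(services):
    """Return LLD JSON: one object per service, keyed by an LLD macro."""
    return json.dumps([{"{#SERVICE}": name} for name in services])

# Hypothetical list of services to monitor on the host.
print(build_discovery(["apache2", "mysql", "ssh"]))
```

Each object in the array becomes one set of items and triggers once the prototypes are created, with {#SERVICE} replaced by the service name.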
It is now time to create a template that uses the new service.discovery UserParameter as the key.
Create a new discovery rule
Create item and trigger prototypes
Add the template to host,
and test.
Hello, and welcome to my course on Zabbix.
Zabbix is a complete open source monitoring software solution for networks, operating systems and applications.
In this course you will install and extensively configure Zabbix Server and multiple Zabbix Agents on Windows and Linux whether on the same network, behind a firewall, on dedicated hardware locally and cloud hosted.
We will also look at,
Installing a Zabbix Server with Database,
Installing Multiple Agents on Ubuntu, CentOS and Windows,
Understanding Active and Passive Checks,
Enabling PSK Encryption for Security,
Host Items, Triggers, Graphs, and Dashboards,
Understanding Zabbix and Proxy health,
Installing Ubuntu, CentOS Operating Systems and Configuring Firewalls,
Alerting with SMTP,
Creating your own templates,
Creating a Network Map,
Reading Windows Event Logs,
Reading Linux system logs,
Item Pre-Processing using Regex, JavaScript and JSON Path,
Dependent items,
JSON API and HTTP monitoring,
Executing remote bat and sh scripts,
Custom User Parameters,
Global Scripts,
Calculated Items,
Auto Registration and Discovery,
Users, Groups and Roles,
and much more.
Zabbix can be used in the enterprise or even on your own home network, where you can have much better visibility of the things connected and running on it and how they are used.
Thanks for taking part in my course, and let's get started.