Managing Oracle Management Cloud Agents

Since Oracle Management Cloud is a cloud service, there aren’t many maintenance tasks that you have to take on. However, there are a few things you’ll want to keep an eye on and receive alerts for, to be sure that agents are running and data is flowing properly. This blog gives you a checklist of best practices for managing and monitoring the agents that support your OMC instance.

Status

You can find the status of all your agents on the Administration > Agents page of the OMC console. This page provides a lot of information about each type of agent: Gateway, Data Collector, Cloud Agent and APM Agent. Check every tab to make sure that all agents are up (Status shows a green arrow) and that the Last Check In column shows a time within the last few minutes. If the Last Check In time is more than 10 minutes old, the agent is likely having communication problems or latency.

If you notice the Last Check In time is older than a few minutes, check the agent itself to be sure there is no backlog and communication is working as expected.

./omcli status agent

Oracle Management Cloud Agent  
Copyright (c) 1996, 2019 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Version                : 1.42.0
State Home             : /u01/oracle/omc/scpext/cloud/agent_inst
Log Directory          : /u01/oracle/omc/scpext/cloud/agent_inst/sysman/log
Binaries Location      : /u01/oracle/omc/scpext/cloud/190508.0800/core/1.42.0
Process ID             : 3731
Parent Process ID      : 3643
URL                    : https://host.oracle.com:4460/emd/main/
Started at             : 2019-06-24 10:25:27
Started by user        : user
Operating System       : Linux version 3.8.13-118.34.1.el6uek.x86_64 (amd64)
Data Collector enabled : false
Sender Status              : FUNCTIONAL
Gateway Upload Status      : FUNCTIONAL
Last successful upload     : 2019-06-26 08:26:16
Last attempted upload      : 2019-06-26 08:26:16
Pending Files (MB)         : 0.01
Pending Files              : 5
Backoff Expiration         : (none)
Dispatched Work Encryption : ENABLED

---------------------------------------------------------------
Agent is Running and Ready
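
If you want to script this check rather than read the output by eye, here is a minimal sketch that reads the omcli status agent output from standard input and looks at the Sender Status, Gateway Upload Status and Pending Files fields shown above. The function names and the MAX_PENDING threshold are my own illustrative choices, not part of omcli, so pick a threshold that suits your environment.

```shell
# Sketch only: evaluates `omcli status agent` output read from stdin.
# MAX_PENDING is a hypothetical threshold, not an Oracle recommendation.
MAX_PENDING="${MAX_PENDING:-100}"

# status_field KEY: print the value of one "Key   : value" line from stdin.
status_field() {
  awk -F'[ ]+:[ ]+' -v key="$1" \
    '$1 == key { sub(/[ \t\r]+$/, "", $2); print $2; exit }'
}

# check_agent_status: succeed only if both statuses are FUNCTIONAL
# and the pending file count is at or below MAX_PENDING.
check_agent_status() {
  local out sender gateway pending
  out="$(cat)"
  sender="$(printf '%s\n' "$out" | status_field 'Sender Status')"
  gateway="$(printf '%s\n' "$out" | status_field 'Gateway Upload Status')"
  pending="$(printf '%s\n' "$out" | status_field 'Pending Files')"

  if [ "$sender" != "FUNCTIONAL" ]; then
    echo "WARN: Sender Status is '${sender}'"
    return 1
  fi
  if [ "$gateway" != "FUNCTIONAL" ]; then
    echo "WARN: Gateway Upload Status is '${gateway}'"
    return 1
  fi
  # Reject an empty or non-numeric count before comparing it.
  case "$pending" in
    ''|*[!0-9]*) echo "WARN: could not read Pending Files"; return 1 ;;
  esac
  if [ "$pending" -gt "$MAX_PENDING" ]; then
    echo "WARN: ${pending} pending files (limit ${MAX_PENDING})"
    return 1
  fi
  echo "OK: sender and gateway FUNCTIONAL, ${pending} pending files"
}
```

You would run it as ./omcli status agent | check_agent_status, for example from cron, and act on a non-zero exit code.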

If there are a lot of pending files, or the Last Check In time in the UI is delayed, try checking the connectivity from agent to OMC with the omcli status agent connectivity command:

./omcli status agent connectivity -verbose

Oracle Management Cloud Agent  
Copyright (c) 1996, 2019 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------

Cloud Agent is Running

No identifiable connectivity issues found between Cloud Agent and Gateway.

---[OMC Pi-----------------------------------------[Time]-[Details]---

    Gateway host.oracle.com                          6ms   HTTP 200 OK
        OMC omc.uscom-1.oraclecloud.com:-1         422ms   HTTP 200 OK
----------------------------------------------------------------------

Check the connectivity status of the Gateway: host.oracle.com
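
The same idea works for the connectivity output. The sketch below reads the verbose output from standard input and flags any Gateway or OMC row that did not return HTTP 200 or that took longer than a limit. It assumes that failing rows keep the same column layout as the healthy rows shown above, which I have not verified. The function name and the MAX_MS limit are hypothetical.

```shell
# Sketch only: evaluates `omcli status agent connectivity -verbose` output.
# MAX_MS is a hypothetical latency limit in milliseconds.
MAX_MS="${MAX_MS:-1000}"

# check_connectivity: read the output on stdin and inspect rows that start
# with "Gateway" or "OMC" and carry a latency such as "6ms" in column 3.
check_connectivity() {
  awk -v max_ms="$MAX_MS" '
    ($1 == "Gateway" || $1 == "OMC") && $3 ~ /^[0-9]+ms$/ {
      hops++
      ms = $3
      sub(/ms$/, "", ms)
      # Column 5 holds the HTTP status code in "HTTP 200 OK".
      if ($5 != "200" || ms + 0 > max_ms + 0) {
        printf "WARN: %s %s took %sms and returned HTTP %s\n", $1, $2, ms, $5
        bad++
      }
    }
    END {
      if (hops == 0) { print "WARN: no Gateway or OMC rows found"; exit 1 }
      if (bad > 0) exit 1
      printf "OK: %d hop(s) answered HTTP 200 within %d ms\n", hops, max_ms
    }
  '
}
```

Run it as ./omcli status agent connectivity -verbose | check_connectivity. Treat the result as a hint to go look at the agent, not as a replacement for reading the full output.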

If your agent is down, check that automatic restart was configured per the documentation so that the agent starts again after a server restart.

Alerts

Checking on agent status is pretty easy, but you really don’t want to have to check the console every day. Let’s configure an alert to notify you when there are agent issues.

From the main OMC menu, click on Alerts. Then click on Alert Rules.

From the Service dropdown, select Agent, then click Create Alert Rule.

Provide a rule name and under Entities select the types of agents you want to alert on. I’m going to select Gateways, Data Collectors and Cloud Agents.

Click on Add condition to select the Availability condition. Then select a notification method. Click on Email Channel to create a new Email Channel, or select an existing one if you have already created it. You can also send to Mobile or configure one of the other notification channels. Finally, click Save.

Additional metric-based alerts can be added for each of the agent types to monitor performance and load. To set these, use the service type Monitoring (instead of Agent) when creating your Alert Rule. If you’re interested in making sure that an agent is uploading data regularly, you could monitor the Total Pending Upload Files metric. The Data Collector entity also has an Error Code metric that provides details on any data collector errors that occur.

Upgrades

With OMC, there is really no excuse for your agents to be out of date or unpatched. Upgrades are easy: you request an upgrade, and the agent receives instructions to download the new version and upgrade itself. Agent updates are available every month, and you should strive to stay within one to three releases of the latest agent version.

From the Administration > Agents page, you can easily upgrade all of your Gateway, Cloud and Data Collector agents. In the Version column, you’ll see a blue circle icon next to the version if a newer agent version is available. You can search for a group of agents, or select them one by one, and click Upgrade. Depending on the number of agents and on network connectivity, the upgrade can take 5-15 minutes to complete. Just be sure to start with the Gateway agents. Once those are complete, you can move on to the Cloud and Data Collector agents.

To view the status of the agent upgrades, click on the menu icon and select Agent Lifecycle Tasks. You can drill in to investigate any failures from here. If the agent upgrade failed, you will find the logs on the host under the $agent_home/agent_inst/sysman/log directory (the Log Directory reported by omcli status agent). If the upgrade failed for a simple reason like lack of space, resolve the issue and retry. If there are other errors, it’s best to open an SR with Oracle Support to get help resolving the issue.
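
Since lack of space is one of the simple failure causes, you may want to check free space on the agent’s filesystem before you retry. The sketch below uses the standard df and awk utilities. The function names and the MIN_FREE_KB threshold (roughly 1 GB) are assumptions for illustration only; I am not quoting an Oracle-documented space requirement here.

```shell
# Sketch only: check free space before retrying an agent upgrade.
# MIN_FREE_KB is a hypothetical threshold, not an Oracle requirement.
MIN_FREE_KB="${MIN_FREE_KB:-1048576}"

# free_kb DIR: print the free space, in KB, on the filesystem holding DIR.
# df -P guarantees one line per filesystem; column 4 is "Available".
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_space DIR: succeed if DIR's filesystem has at least MIN_FREE_KB free.
has_space() {
  local free
  free="$(free_kb "$1")"
  case "$free" in
    ''|*[!0-9]*) echo "WARN: could not read free space for $1"; return 1 ;;
  esac
  if [ "$free" -ge "$MIN_FREE_KB" ]; then
    echo "OK: ${free} KB free on $1"
  else
    echo "WARN: only ${free} KB free on $1"
    return 1
  fi
}
```

For example, has_space /u01/oracle/omc checks the filesystem that holds the agent installation used in the sample output earlier in this post.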

For more information on agent upgrades and lifecycle tasks, see the Oracle Management Cloud documentation.

Agent Logs

Agent logs are available on the host under $agent_home/agent_inst/sysman/log (the Log Directory reported by omcli status agent). The key log is gcagent.log, but there are many others that contain valuable information. Oracle Support may ask you to generate an agent log bundle. This collects system statistics and zips all agent logs into a bundle file that can be uploaded to Support.

./omcli generate_support_bundle agent /home/oracle
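
Before you open an SR, a quick look at the most recent errors in gcagent.log can sometimes point you in the right direction. Here is a small sketch that prints the last few lines containing the word ERROR. It assumes that error entries in the log include that literal string, which you should confirm against your own log files. The function name is hypothetical.

```shell
# Sketch only: show the most recent lines that mention ERROR in a log file.
# Assumes error entries contain the literal string "ERROR".

# recent_errors FILE [N]: print the last N (default 20) matching lines.
recent_errors() {
  grep -F 'ERROR' -- "$1" | tail -n "${2:-20}"
}
```

For example, recent_errors /u01/oracle/omc/scpext/cloud/agent_inst/sysman/log/gcagent.log 10 would show the ten most recent matches for the agent in the sample status output above.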

Another option is to load your agent logs into Log Analytics. You can do this easily by going to Administration > Agents, clicking on the Action menu for the agent and selecting View Details. There you will find a toggle button to enable Log Collection. The logs will then appear in Log Analytics, where they can be searched, clustered and alerted on like any other logs you upload.