For Enterprise Manager, the Agent plays a big role. The Agent is in charge of gathering all information from the targets and performing administrative tasks and jobs. To do its job, the Agent has to be healthy. In this post, we’re going to take a look at some checks that you can perform to make sure that your Agents and Targets are monitored and reporting properly.
Let’s take a look at what we can find in the Agents page. In my opinion, this is a page the main EM Administrator should be checking every day. They should know which targets aren’t uploading, and which Agents are down or have been blacked out for months. From Setup / Manage Cloud Control / Agents we get the view below:
First we have the ability to filter our view by Status. We can view All, Up, Down, Blackout or Unreachable. We can also look for agents that are Blocked or Misconfigured.
Agents are secured during install by default, so if there are any No values in the Secure Upload column, those agents should be resecured.
There are various statuses that you’ll find here. The ones that most commonly need investigation are Unreachable and Blocked.
Unreachable agents will have an additional icon for Diagnostic Analysis. This will run some checks to help determine why an agent is unreachable.
The Last Successful Load can tell us a lot about an agent and its targets. If you sort by this column and find agents that haven’t uploaded for months or maybe even years, it’s a sign that nobody’s paying attention, and possibly that the target has been decommissioned. If you can’t find a contact for these, then maybe a blackout is in order.
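If you’d rather script this check than sort the column by hand, here is a minimal sketch of the idea. It assumes you have already pulled agent names and their Last Successful Load times into a dictionary by some means of your own. The host names, the 30-day threshold, and the `stale_agents` function are all hypothetical, not anything EM provides.

```python
from datetime import datetime, timedelta

def stale_agents(agents, now, max_age=timedelta(days=30)):
    """Return (name, age) pairs for agents whose last load is older than max_age.

    The result is sorted oldest first, like sorting the column in the Agents page.
    """
    stale = [
        (name, now - last_load)
        for name, last_load in agents.items()
        if now - last_load > max_age
    ]
    return sorted(stale, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample data: agent name -> Last Successful Load
now = datetime(2024, 6, 1, 12, 0)
agents = {
    "hosta.example.com:3872": datetime(2024, 6, 1, 11, 50),
    "hostb.example.com:3872": datetime(2024, 2, 1, 8, 0),
    "hostc.example.com:3872": datetime(2022, 11, 15, 8, 0),
}

for name, age in stale_agents(agents, now):
    print(f"{name}: no successful load for {age.days} days")
```

The threshold is deliberately a parameter, since what counts as "too long" depends on how your site handles blackouts and decommissions.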
I think the next two columns get overlooked often.
The first is Monitored Targets. This tells us how many targets (including host and agent) are being managed. Now, every agent will have 3 at a minimum (agent, host, agent Oracle Home). So if you see agents that have 2 or 3, that’s a big clue to start investigating just what you are monitoring on that server. Is it new, or has it been decommissioned and not cleaned up properly?
The second is Broken Targets, another key to diagnosing issues. If there’s anything but a 0 in this column, it deserves some attention to find out exactly what is broken and why.
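The two column checks above are easy to express as a small triage rule. This is only a sketch: it assumes the page contents are available as a list of dictionaries, and the keys `agent`, `monitored`, and `broken`, the sample rows, and the `triage` function are all made up for illustration.

```python
# Baseline every agent is expected to have: agent, host, agent Oracle Home
MIN_TARGETS = 3

def triage(rows):
    """Return (agent, reasons) for every agent that deserves a closer look."""
    findings = []
    for row in rows:
        reasons = []
        if row["monitored"] <= MIN_TARGETS:
            reasons.append("only baseline targets monitored")
        if row["broken"] > 0:
            reasons.append(f"{row['broken']} broken target(s)")
        if reasons:
            findings.append((row["agent"], reasons))
    return findings

# Hypothetical rows copied from the Agents page
rows = [
    {"agent": "hosta.example.com:3872", "monitored": 12, "broken": 0},
    {"agent": "hostb.example.com:3872", "monitored": 3, "broken": 0},
    {"agent": "hostc.example.com:3872", "monitored": 9, "broken": 2},
]

for agent, reasons in triage(rows):
    print(f"{agent}: {', '.join(reasons)}")
```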
Just like any other target in EM, you’ll find that a wealth of information is collected about the agent target in Monitoring / All Metrics. Each category can have multiple metrics collected. You will find information here on certificate expiration, disk usage, memory usage, and CPU usage, among others.
When you click on the Category, the right-hand pane will show real-time data. This can be very helpful in troubleshooting metric issues. You will see the last upload time. Metrics are uploaded every 15 minutes, so if the last upload is older than 15 to 20 minutes, the agent is having a problem.
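The rule of thumb on upload age boils down to simple arithmetic. Here is a minimal sketch, using the 15-minute upload interval mentioned above plus a 5-minute grace period. The grace period and the `upload_is_late` function are my own assumptions for illustration, not EM settings.

```python
from datetime import datetime, timedelta

def upload_is_late(last_upload, now,
                   interval=timedelta(minutes=15),
                   grace=timedelta(minutes=5)):
    """Return True when the last upload is older than the interval plus grace."""
    return now - last_upload > interval + grace

# Hypothetical timestamps read from the All Metrics page
now = datetime(2024, 6, 1, 12, 0)
print(upload_is_late(datetime(2024, 6, 1, 11, 50), now))  # 10 minutes old
print(upload_is_late(datetime(2024, 6, 1, 11, 30), now))  # 30 minutes old
```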
Some of the metrics are very helpful, specifically in the EMD… categories. In these metrics you’ll find information about agent and OMS communication (heartbeat, ping, and upload). Take some time to review the metrics that are collected and available on your agents.
There are a couple of Information Publisher reports that can be useful in identifying problems on Agents and other targets. From Enterprise / Reports / Information Publisher, the first reports are under the Enterprise Manager Health category.
Agent Clock Synchronization Offset will give you the agents whose system clock may be offset from the Repository system clock, based on the timestamp of the last heartbeat recorded. As many pieces of EM rely on the timestamp, an offset of more than a few minutes can cause problems.
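To make the idea of an offset concrete, here is a rough sketch of the comparison. It is not how the report itself is implemented. It simply assumes you have, for each agent, the time the agent stamped on its last heartbeat and the time the Repository recorded it. The 3-minute tolerance, the sample data, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

def clock_offset(agent_time, repository_time):
    """Absolute difference between the agent clock and the Repository clock."""
    return abs(repository_time - agent_time)

def agents_out_of_sync(heartbeats, tolerance=timedelta(minutes=3)):
    """Return {agent: offset} for agents whose offset exceeds the tolerance."""
    return {
        name: clock_offset(agent_time, repository_time)
        for name, (agent_time, repository_time) in heartbeats.items()
        if clock_offset(agent_time, repository_time) > tolerance
    }

# Hypothetical data: agent -> (agent timestamp, Repository timestamp)
heartbeats = {
    "hosta.example.com:3872": (datetime(2024, 6, 1, 12, 0, 0), datetime(2024, 6, 1, 12, 0, 20)),
    "hostb.example.com:3872": (datetime(2024, 6, 1, 12, 7, 0), datetime(2024, 6, 1, 12, 0, 0)),
}

for name, offset in agents_out_of_sync(heartbeats).items():
    print(f"{name}: clock offset of {offset}")
```

Taking the absolute value means an agent clock that runs ahead is flagged the same way as one that runs behind.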
Agents in Questionable State shows Agents which are in Metric Error state, Agent Down state and Pending/Unknown state for the last 24 hours.
Agents Restarted shows the Agents which have restarted in the last 24 hours. If an Agent is consistently on this report, it’s a sign that it’s crashing.
Broken Targets is another view of the targets we looked at earlier, where they are misconfigured or not monitored for some reason. This is a good report to have emailed out regularly, to be sure you know what targets are having problems.
Targets Not Uploading Data is another one that EM Administrators should regularly view. It lists the targets that aren’t uploading consistently, along with their last upload date.
Another set of reports, under the Target Status Diagnostics category, will help in diagnosing issues with specific targets. Agent-based targets are going to be your Host, Database Instance, and Listener. Repository-based targets will be Cluster, Cluster Database, System, etc. In a future post, I’ll break down these reports further.
In part 2 of this blog, we’ll take a look at using EMDIAG’s AGTVFY scripts, as well as looking at some of the Agent log files and common errors. Stay tuned!