Getting to Know EMDIAG: repvfy execute optimize

In my group, we work with a lot of customers with very large EM environments, in the range of 2000+ agents. As you can imagine, there’s a little bit of optimizing that needs to get done to account for these numbers.

A few of these standard tweaks have been put into the repvfy execute optimize command.    You can make all these changes individually, but if you want to get them all done at once, optimize is your tool.

There are three categories of optimization handled at this point: Internal Tasks, Repository Settings and Target System. The script first evaluates the size of your repository based on the number of agents, and from there determines which optimizations need to be done or recommended for future implementation.

Internal Task Tuning

Enterprise Manager uses short and long workers, depending on the task activity. We typically recommend 2 workers of each type for most larger systems, so this is what repvfy execute optimize sets. For smaller systems, the default setting of 1 each is usually sufficient. You can view the configuration in EM on the Manage Cloud Control -> Repository page. Here you can also configure the short workers, but not the long. If you see a high collection backlog, this is an indication that you’re in need of additional task workers.

The next step is to evaluate the current settings of the job system and ensure that there are enough connections available for it. This change is not implemented automatically; instead, the emctl command is printed out for you to run, because the change requires a restart to take effect. Recommendations for Large Job System Load can be found in the Sizing chapter of the Advanced Installation Guide. Increasing the number of connections may require an increase in the database processes value.

Repository Settings Tuning

EM tracks system errors in one of its tables. In larger systems, the MGMT_SYSTEM_ERROR_LOG table can become quite large over the 31 day default retention. The optimize script reduces log retention to 7 days for normal operation.

There are also various levels of tracing enabled by default, which can generate a lot of extra activity during normal operations if you’re not utilizing the traces. Tracing is turned off by the optimize command. It can be enabled at any time by using the repvfy send start_trace -name <name> and repvfy send start_repotrace commands.

Finally, this step looks for any invalid SYSMAN objects and recompiles them, then checks for stale optimizer statistics and makes a recommendation as needed.

System Tuning

After an EM outage or downtime, all the agents will attempt to upload and update their status (or heartbeat) with the OMS.  There’s a grace period in which no alerts are sent.  In larger systems, this grace period may not be long enough to get all agents updated before alerts start going out.   This can be adjusted by increasing that grace period.

In 12.1.0.3 and higher, you can also increase the number of threads that perform the ping heartbeat tasks. This should be done if you have more than 2000 agents per OMS. The optimize command will make this calculation for you and recommend the appropriate emctl command to set the heartbeatPingRecorderThreads property. Recommendations for Large Number of Agents can be found in the Sizing chapter of the Advanced Installation Guide.

The optimize command will only output those items that require attention, so not every item will appear in the output on every site.
The recommended values reported in the output are specific to THAT environment and should not be copied over to another environment as-is. To tune another EM environment, run the optimize script on that environment.

Sample output from a small EM system:

bash-4.1$ ./repvfy execute optimize

Please enter the SYSMAN password:
SQL*Plus: Release 11.1.0.7.0 - Production on Thu Jul 9 07:59:35 2015

Copyright (c) 1982, 2008, Oracle. All rights reserved.

SQL> Connected.

Session altered.
Session altered.

========== ========== ========== ========== ========== ========== ==========
== Internal task system tuning ==
========== ========== ========== ========== ========== ========== ==========

- Setting the number of short workers to 2 (1->2)
- Setting the number of long workers to 2 (1->2)
========== ========== ========== ========== ========== ========== ==========
========== ========== ========== ========== ========== ========== ==========
== Job system tuning ==
========== ========== ========== ========== ========== ========== ==========

- On each OMS, run this command:
  $ emctl set property -name oracle.sysman.core.conn.maxConnForJobWorkers -value 72 -module emoms
  This change will require a bounce of the OMS

========== ========== ========== ========== ========== ========== ==========
========== ========== ========== ========== ========== ========== ==========
== Repository tuning ==
========== ========== ========== ========== ========== ========== ==========
- Setting retention for MGMT_SYSTEM_ERROR_LOG table to 7 days (31->7)

- Disabling PL/SQL tracing for module (EM.GDS)
- Disabling PL/SQL tracing for module (EM_DBM)

- Disabling repository metric tracing for ID (1234)

- Recompiling invalid object (foo,TRIGGER)
- Recompiling invalid object (bar,CONSTRAINT)

- Stale CBO statistics in the repository. Gather statistics for the SYSMAN schema
  Command to use:
  $ repvfy send gather_stats
  Or:
  SQL> exec emd_maintenance.gather_sysman_stats_job(p_gather_all=>'YES');

========== ========== ========== ========== ========== ========== ==========
========== ========== ========== ========== ========== ========== ==========
== Target system tuning ==
========== ========== ========== ========== ========== ========== ==========

- Setting the PING grace period to (90) (60->90)

- Set the parameter oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads to 3
  $ emctl set property -module emoms -name oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads -value 3

========== ========== ========== ========== ========== ========== ==========
not spooling currently
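
If you keep a copy of the optimize output, the recommended emctl commands can be pulled out for review before you run them on each OMS. The following is only a minimal sketch: extract_emctl_cmds is a hypothetical helper name, how you capture the output is up to you, and the only thing it relies on is the "$ emctl ..." line layout seen in the sample output above. It prints the commands and never runs them.

```shell
# Hypothetical helper: read saved optimize output on stdin and print every
# "$ emctl ..." line without the leading prompt. Nothing is executed.
extract_emctl_cmds() {
    sed -n 's/^[[:space:]]*\$[[:space:]]*\(emctl[[:space:]].*\)$/\1/p'
}

# Sample lines copied from the output above.
extract_emctl_cmds <<'EOF'
  $ emctl set property -name oracle.sysman.core.conn.maxConnForJobWorkers -value 72 -module emoms
  This change will require a bounce of the OMS
  $ repvfy send gather_stats
  $ emctl set property -module emoms -name oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads -value 3
EOF
```

With the sample lines, only the two emctl commands are printed; the repvfy recommendation and the plain text lines are skipped.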

Changing OMS or Agent Properties in OEM Console

There was a thread going on earlier this week about how to change a value for an OMS property setting.   This is typically done when working with support to adjust a timing or enable debug or tracing.

The most common way is using emctl set property at command level.

$ emctl set property -name oracle.sysman.eml.maxInactiveTime -value 60 

To get the current setting there’s an emctl get property command:

$ emctl get property -name oracle.sysman.eml.maxInactiveTime

That usually works great if support has just given you the exact command to run, or you’re using a MOS note to reference the exact syntax. However, if you’re getting older like me, syntax is one of those things that ends up in the way back corners of your brain, and you tend to forget: was it oracle.sysman.eml.maxInactiveTime or was it oracle.em.sysman.maxTimeout or…? There are just too many to remember. Lucky for us, there is now a place to view and set these properties in the EM console. You can find it under Setup / Manage Cloud Control / Management Services.

Then under the Management Servers menu select Configuration properties.

From here you’ll get a window that lists the non-default properties. Note that some properties will not show up here. That doesn’t mean they are not set, just that they have the system default value.

By switching the Show view to All, you’ll see a larger list of properties.  Not all of them are modifiable, as indicated by the lock icon.   If you want to view more information about a property, or modify it, click on the Name.    This will bring up a new window, with the ability to modify the value and save.  This view will also tell you whether the property is Dynamic (can be changed without OMS restart).   If you expand the Change History, you can also view the previous changes for this parameter.

Of course, not all properties are OMS based, so there’s an equivalent option on the Agent side.  From the Agent home page, click on Agent menu then select Properties.

You will get a list of properties, some of which you can edit and some you can’t. By default the basic properties are shown.

Select Advanced Properties to see additional agent properties such as dynamicPropsComputeTimeout which is often adjusted on very large servers.

This is great right?  But what if you want to change the same property on 1000 agents?   Well, that’s in here too!  Click on Setup / Manage Cloud Control / Agents.

From there, click on the agents (or just one for now) and click Properties.

This will start a job wizard in which you can add additional agents by clicking Add in the Targets section.  Then click on the Parameters tab.

Now you can set the parameter value that you wish to push out to all selected agents.

The caveat: use with caution and common sense. There are a lot of parameters in here and very few are documented, and some should not be changed unless directed by Oracle Support. So don’t go cowboy on us and start tweaking them all just to see what they do!

Getting to Know EMDIAG – repvfy show score_card

Continuing on with a series of EMDIAG commands that I find useful, today we’re going to look at repvfy show score_card. You may also want to review the previous post on repvfy diag all.

One of the main functions of repvfy is the verify function (i.e. repvfy verify -details -level 9).  This goes through a long list of checks to identify areas you might need to investigate.  This could be anything from Agents with Clock Skew to Expired User accounts.

Now there’s a scorecard (show score_card) that reads the output of the verify log, summarizes the categories of errors that were found, and generates a score based on the weight, number of tests and number of violations. This can help you track improvements as you work through any cleanup and issues.

repvfy show score_card (also try score_card_details)

Category                  Score  #Tests #Viol
------------------------- ------ ------ ------
Best Practice              81.59      6    780
Configuration              21.64     20   2298
Data Integrity             25.09     29   2793
Monitoring/Operations      60.71     23    886
------------------------- ------ ------ ------
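
When you are comparing runs, it can help to sort the score card by violation count. The sketch below is only an illustration: rank_by_violations is a hypothetical helper, it assumes the column layout shown above (three numeric columns at the end of each row), and it sorts by #Viol rather than Score because the output alone doesn't say whether a higher score is better.

```shell
# Hypothetical helper: read score card rows on stdin and print
# "violations<TAB>category", highest violation count first.
# Header and separator lines are skipped because their last field
# is not a plain number.
rank_by_violations() {
    awk '
        NF >= 4 && $NF ~ /^[0-9]+$/ {
            viol = $NF
            # Everything before the three numeric columns is the category name.
            category = $1
            for (i = 2; i <= NF - 3; i++) category = category " " $i
            printf "%d\t%s\n", viol, category
        }
    ' | sort -rn
}

# Rows copied from the sample score card above.
rank_by_violations <<'EOF'
Category                  Score  #Tests #Viol
Best Practice              81.59      6    780
Configuration              21.64     20   2298
Data Integrity             25.09     29   2793
Monitoring/Operations      60.71     23    886
EOF
```

With the sample rows, Data Integrity (2793 violations) comes out on top and Best Practice (780) at the bottom.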

The output of show score_card_details will give the breakdown by category (Best Practice, Configuration, Data Integrity, Monitoring/Operations) and Module (Targets, Agents, etc) along with the test to run (i.e. repvfy verify targets -test 6002).

Note: The sample output below has been modified and shortened to fit the blog.

$ repvfy show score_card_details

Category Rank Module  ID   Test                                         Score #Viol
BP          1 TARGETS 6002 OMS mediated targets without backup Agent     10.0   639
BP          2 AGENTS  6006 Deployed Agent plugins lower than OMS plugin   6.6   106
BP          3 REPO    6039 Newer version avail for deployed OMS plugin    1.1     9
BP          4 REPO    6005 Tables with locked statistics                  0.3    23
BP          5 JOBS    6006 Job steps running more than two hours          0.3     2

Using repvfy verify -details will help you identify targets to investigate. Some tests have automated fixes that can be run with repvfy verify <module> -test <test#> -fix. Other issues may have manual steps or suggestions, and some may require opening an SR with Oracle Support. After fixing issues, rerun the full repvfy verify -level 9 and regenerate the score card to track your progress!
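
To turn the details listing into a to-do list, you could generate one repvfy verify command per row. This is a rough sketch under a few assumptions: verify_cmds_from_details is a hypothetical helper, the row layout is the shortened one from this post (category, rank, module, test ID, description), and the module name is simply lowercased the way it appears in the repvfy verify targets -test 6002 example. Check the real score_card_details output for the exact module names before relying on it. The commands are printed, not run, and -fix is deliberately left off.

```shell
# Hypothetical helper: read score_card_details rows on stdin and print a
# "repvfy verify <module> -test <id>" line for each data row.
verify_cmds_from_details() {
    awk '
        # Data rows have a numeric rank in field 2 and a numeric test ID in
        # field 4; the header row fails both checks and is skipped.
        $2 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            printf "repvfy verify %s -test %s\n", tolower($3), $4
        }
    '
}

# Rows copied from the shortened sample above.
verify_cmds_from_details <<'EOF'
Category        Rank Module          ID   Test               Score  #Viol
BP  1 TARGETS 6002 OMS mediated targets without backup Agent   10.0  639
BP  2 AGENTS  6006 Deployed Agent plugins lower than OMS plugin 6.6  106
EOF
```

The first line printed for the sample is repvfy verify targets -test 6002, which you can then run with -details to see the affected targets.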

Enterprise Manager Patching FAQ

Every time a new plug-in, patch or PSU comes out for Enterprise Manager, I get a series of questions.   I’m going to quickly address a few of the ones I’ve gotten this month with the release of the new plug-ins and the PSU patches to try and help you in updating your EM site. I wrote a detailed blog EM Patching 101 about the various patches, and that’s still relevant so read if you haven’t (or read it again)!

What’s the difference between the Plug-in release (released 1/5/2015) and the Plug-in system bundle (released 12/31/2014)?  

The plug-in system bundle will come out every month on the last day of the month.  The next one is scheduled for 1/31/2015.   This is a cumulative compilation of monthly plug-in bundle patches.   So it will look at what plug-ins you have installed on the OMS, and it will deploy the latest plug-in patch bundle to that plug-in.  It will skip what you don’t have installed.   These patches will include fixes and will be indicated by the 5th digit in the version.  For example, the latest Database plug-in patch was 12.1.0.6.7.

The plug-in updates released on 1/5 were full plug-in version upgrades.  These need to be deployed to the OMS by using Extensibility -> Plug-ins and deploy to OMS.  They may require downtime.  Plug-in updates for Database, Fusion Middleware, Storage Management and Cloud were released.  These updates include new features and fixes.   For example, the Database plug-in is now 12.1.0.7.0.
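
A quick way to tell the two apart on a version string is to look at that fifth digit. The snippet below is just a sketch: both function names are hypothetical, and treating a fifth digit of 0 as "full plug-in release, no bundle patch applied" is an assumption drawn from the two examples above (12.1.0.6.7 and 12.1.0.7.0).

```shell
# Hypothetical helper: print the fifth dot-separated field of a version
# string. Prints nothing if the string has fewer than five fields.
plugin_patch_level() {
    printf '%s\n' "$1" | cut -s -d. -f5
}

# Hypothetical helper: describe a plug-in version using the assumption that
# a fifth digit of 0 means no bundle patch has been applied.
describe_plugin_version() {
    level=$(plugin_patch_level "$1")
    case $level in
        '') echo "$1: not a five-part version" ;;
        0)  echo "$1: full plug-in release, no bundle patch (assumed)" ;;
        *)  echo "$1: bundle patch level $level" ;;
    esac
}

describe_plugin_version 12.1.0.6.7
describe_plugin_version 12.1.0.7.0
```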

Which one should I apply?

My recommendation would be to test and apply both.   The plug-in updates for Database and other components you use have the latest features and code that will enable new features or improvements.   You will also want to apply the latest system bundle patch (whether it’s December’s or January’s) to patch the other plug-ins that did not get updated (MOS for example).

Should I deploy the new Plug-ins first or apply patches first?

My suggestion is to apply the Plug-in updates first.  The plug-ins released in January include Database, Fusion Middleware, Cloud and Storage Management plug-ins.  When you download and apply these to the OMS, and Agents if required, they will not need the patches from the December 31st patch bundle, and you’ll have the most recent update. After deploying the plug-in updates, when you apply the Plug-in System patch (20188140) you will see that the updated plug-ins (DB, MW, etc) are skipped.

How often should I patch EM?   Agents?

I recommend that most of my customers stick to a quarterly patching cycle, which typically follows the PSU cycle. So start looking at the patches that are released in the January PSU and testing them. At that time, grab all the recommended or latest plug-in and agent patches, and apply those as well. The agent and plug-in patches will come out monthly, and if you need a patch for a particular bug or issue, then you should apply it. However, patching EM monthly is not feasible for most customers.

What patches do I need to apply as of now (January 2015)?

As mentioned in the previous Patching 101 blog, there are multiple components involved, each with its own patch. So here’s what I’d recommend if you were my customer deploying or updating this month:

  • Latest Database PSU Patch (1/15) for your repository version
  • OMS 12.1.0.4.2 PSU Patch – 19830994
  • OMS 12.1.0.4.7 Plug-in System Patch – 20188140
  • WebLogic 10.3.6.0.10 PSU Patch – 19637463
  • WebLogic Server one-off recommended – 16420963
  • OPMN Patch (CVE-2014-4212) – 19345576
  • Oracle Help Patch (CVE-2015-0426) – 20075252
  • Agent 12.1.0.4.4 Bundle – 20109357
  • Agent Plugin Patches for non-Updated plug-ins as necessary – From Doc ID 1900943.1  (Siebel, OVI, Exadata)

When I apply Weblogic PSU it has a conflict, can I roll it back and how do I do it?

When you follow instructions to apply the PSU, you may receive a warning that there’s a conflict and patches are mutually exclusive such as the following:

$ ./bsu.sh -install -patch_download_dir=<MW_HOME>/utils/bsu/cache_dir -patchlist=12UV -prod_dir=<MW_HOME>/wlserver_10.3
Checking for conflicts..
Conflict(s) detected - resolve conflict condition and execute patch installation again
Conflict condition details follow:
Patch 12UV is mutually exclusive and cannot coexist with patch(es): FSR2

You must deinstall the conflicting patch first, and then install the new PSU. The patches are cumulative.

$ ./bsu.sh -remove -patchlist=FSR2 -prod_dir=<MW_HOME>/wlserver_10.3
Checking for conflicts..
No conflict(s) detected
Removing Patch ID: FSR2..
Result: Success

Now you can install the new PSU:

$ ./bsu.sh -install -patch_download_dir=<MW_HOME>/utils/bsu/cache_dir -patchlist=12UV -prod_dir=<MW_HOME>/wlserver_10.3
Checking for conflicts..
No conflict(s) detected
Installing Patch ID: 12UV..
Result: Success
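
If you want to script around this, the conflicting patch ID can be picked out of the bsu.sh output. This is a minimal sketch, assuming the conflict line is worded exactly as in the message above; conflicting_patches is a hypothetical helper, and how bsu.sh lists more than one conflicting patch is not covered here, so the function just prints whatever follows "patch(es):". The removal command is printed for review, not run.

```shell
# Hypothetical helper: read bsu.sh output on stdin and print whatever follows
# "patch(es):" on the conflict line.
conflicting_patches() {
    sed -n 's/^Patch .* is mutually exclusive and cannot coexist with patch(es): *//p'
}

# Sample output copied from the failed install above.
sample='Checking for conflicts..
Conflict(s) detected - resolve conflict condition and execute patch installation again
Conflict condition details follow:
Patch 12UV is mutually exclusive and cannot coexist with patch(es): FSR2'

conflict=$(printf '%s\n' "$sample" | conflicting_patches)

# Print the removal command for review; <MW_HOME> is left as a placeholder.
echo "./bsu.sh -remove -patchlist=$conflict -prod_dir=<MW_HOME>/wlserver_10.3"
```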

How do I apply the Oracle Help Patch (20075252)?

The January PSU calls for 17617649 on the Oracle Help product, but this conflicts with a pre-existing EM patch. There is a merge patch, 20075252, available for this. The Oracle Help patch is applied to <MW_HOME>/oracle_common and uses the OPatch from that directory as well. Read the readme.txt for full patching steps, but here are some tips for applying it.

emctl stop oms -all
export MW_HOME=<MW_HOME>
export ORACLE_HOME=$MW_HOME/oracle_common
export PATH=$ORACLE_HOME/OPatch:$PATH

You should now be able to run opatch lsinventory and see the inventory for this ORACLE_HOME, then run opatch apply as described in the readme.txt.

How do I apply the OPMN Patch (19345576)?

The readme.txt asks you to set Oracle Home to the Classic/WebTier home. For EM, this is the MW_HOME/Oracle_WT directory.

emctl stop oms -all
export MW_HOME=<MW_HOME>
export ORACLE_HOME=$MW_HOME/Oracle_WT
export PATH=$MW_HOME/oracle_common/OPatch:$PATH

You should now be able to run opatch lsinventory and see the inventory for this ORACLE_HOME, then run opatch apply as described in the readme.txt.

Let me know if you have any other patching questions and I’ll add them here!

Patching 101 – The User Friendly Guide to Understanding EM Patches

There was a conversation on twitter last week about available patches for Enterprise Manager (EM) 12.1.0.4, and it got a little deeper than 140 characters will allow.  I’ve written this blog to give a quick Patching 101 on the types of EM patches you might see and the details around how they can be applied.
