SMS Notifications From Enterprise Manager

Starting with Enterprise Manager (EM) 12c, you have the choice to send SMS notifications to a cell phone or a pager (does anybody still use pagers?). There have been a couple of questions on the forums about this, so I thought I’d write it up, since how it works appears to be a bit confusing.

First, be sure that your Mail Server is set up in Setup / Notifications / Notification Methods and that you can receive the test e-mail.


Next, create an EM administrator user, then log in as that user to update your e-mail address and SMS/Pager information by clicking on the Username drop-down menu and selecting Enterprise Manager Password & Email.


For the SMS/Pager, you need the text-based address. For example, if your provider is Verizon, it would be 8885555555@vtext.com. Select the E-mail Type Pager for SMS addresses, since the Pager format is shorter than the Email format. Note that the Setup / Security / Administrators view does not show the separate address lines. There you can enter multiple e-mail addresses separated by commas, but the E-mail Type option is not available.
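Here's a minimal sketch of how that text-based address is put together. The function name and the 10-digit check are my own assumptions for illustration; vtext.com is the only gateway domain mentioned here, so check with your own provider for theirs.

```python
import re


def sms_gateway_address(phone_number: str, gateway_domain: str = "vtext.com") -> str:
    """Build an e-mail-to-SMS address such as 8885555555@vtext.com.

    Hypothetical helper: the default domain is the Verizon example from the
    post; other carriers use different domains.
    """
    digits = re.sub(r"\D", "", phone_number)  # drop spaces, dashes, parentheses
    if len(digits) != 10:
        # Assumption: a 10-digit US number, as in the example above.
        raise ValueError(f"expected a 10-digit number, got {phone_number!r}")
    return f"{digits}@{gateway_domain}"


print(sms_gateway_address("(888) 555-5555"))  # prints 8885555555@vtext.com
```

The resulting string is what goes into the address line with the Pager E-mail Type.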


By default, both e-mail and pager are enabled in your Notification Schedule. You can adjust this as necessary by going to Setup / Notifications / My Notification Schedule. Depending on what you configure here, you can receive notifications by both e-mail and page, just e-mail, or just page.

Next, you need to create an Incident Rule Set, or edit an existing one.   From Setup / Incidents / Incident Rules, select a rule set and rule to edit.  Once you get to the Action for the Rule, in the Basic Notifications section, select the EM Administrator in the Page box.  Save all changes.


In 12cR4, you can test the Incident Rule Set by selecting your Rule Set and clicking the Simulate Rules button (Setup / Incidents / Incident Rules). You will need to select a Target and an Event Type, then find an alert to simulate. You will then get a list of the Actions that the Incident Rule Set would perform for this alert.

To test my notifications, I dropped the warning threshold on the tablespace metric to 15, which I knew would trigger immediately. Here are the messages I received on my phone.


I also received the long format in my e-mail.  When you’re done testing, don’t forget to set your thresholds back!
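To see why lowering the warning threshold fires an alert right away, here's a rough sketch of a threshold check. This is not EM's internal logic; the function, the "greater than or equal" comparison, and the sample values (42% used, 85/97 thresholds) are assumptions made up for the example.

```python
from typing import Optional


def severity(value: float, warning: float, critical: float) -> Optional[str]:
    """Return the severity for a metric where higher values are worse.

    Assumption: a value that reaches a threshold counts as crossing it.
    """
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return None  # below both thresholds: no alert


used_pct = 42.0  # hypothetical tablespace usage

print(severity(used_pct, warning=85, critical=97))  # usual thresholds: None
print(severity(used_pct, warning=15, critical=97))  # lowered for testing: WARNING
```

Once the warning threshold is set back to its normal value, the same usage no longer raises anything.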

 

Introducing Oracle Management Cloud…


If you were at Oracle OpenWorld this year, you might have had the chance to preview Oracle’s latest cloud service, Oracle Management Cloud (OMC). The three OMC services launched this month are Application Performance Monitoring (APM), Log Analytics (LA) and IT Analytics (ITA). The goal for OMC is to bring together different types of operational data for use by both the business and IT. All data is stored in a unified data platform that allows you to navigate from one service to another while working on specific use cases. Here’s a quick overview of each service to get you up to speed.

Application Performance Monitoring

This service gives you insight into end-user performance and experience, from browser statistics to AJAX calls. With statistics on page load times and errors, you can drill down to find errors, see the request calls, and review memory usage and garbage collection.


The integration with Log Analytics allows the user to drill down to the server and application logs for the periods of poor performance. By using saved searches and creating custom dashboards, you can see the information that’s important to your application in one view.


Log Analytics

Upload all your logs to the cloud (database, application, middleware, server, infrastructure), then search and explore to identify problems or resolve issues. Troubleshoot problems by exploring logs in the context of the application using topology-aware log exploration. Then use the cluster feature to identify outliers or frequently occurring patterns. There’s always a lot of noise in log files, so this lets you filter out the noise and get to the real errors that you need to see.


Another key feature is the ability to correlate other logs within a time period. Let’s say you found an error at 1:00; you can then see the log entries from 1 minute before and after on different systems to identify any correlated log entries. LA also includes the ability to save queries and use them in a dashboard so you can replicate your searches with ease. Not to mention that you can store logs in the Oracle cloud for long periods of time instead of on your servers.
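The time-window idea is simple enough to sketch in a few lines. This is only an illustration of the concept, not how Log Analytics does it; the entry layout, host names, and messages below are made up.

```python
from datetime import datetime, timedelta


def correlated_entries(entries, error_time, window=timedelta(minutes=1)):
    """Return entries from any system within +/- window of error_time."""
    return [e for e in entries if abs(e["time"] - error_time) <= window]


# Hypothetical log entries from three different systems.
logs = [
    {"time": datetime(2015, 12, 1, 12, 57, 0), "source": "web01", "msg": "health check ok"},
    {"time": datetime(2015, 12, 1, 12, 59, 30), "source": "db01", "msg": "archiver stuck"},
    {"time": datetime(2015, 12, 1, 13, 0, 0), "source": "app01", "msg": "request failed"},
    {"time": datetime(2015, 12, 1, 13, 0, 45), "source": "web01", "msg": "HTTP 500 returned"},
    {"time": datetime(2015, 12, 1, 13, 5, 0), "source": "db01", "msg": "log switch"},
]

error_time = datetime(2015, 12, 1, 13, 0, 0)  # the error found at 1:00
for entry in correlated_entries(logs, error_time):
    print(entry["time"].time(), entry["source"], entry["msg"])
```

Only the three entries between 12:59 and 1:01 come back, regardless of which system wrote them.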

IT Analytics

Finally, a way to look at IT resources holistically and see how targets in your environment compare to one another. Resource Analytics allows you to view current utilization as well as forecast things like storage and CPU across targets, or groups of targets. Answer questions like: how much storage will we need in 6 months? Which databases are consuming the most CPU, and how much has their usage increased recently?
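To make the "how much storage in 6 months?" question concrete, here's a back-of-the-envelope sketch using a straight-line trend. I don't know what forecasting model ITA actually uses, so treat the least-squares fit and the monthly numbers below as assumptions for illustration only.

```python
def linear_forecast(history, months_ahead):
    """Fit a least-squares line through monthly samples and extrapolate it.

    history[i] is the value for month i; the result is the projected value
    months_ahead months after the last sample.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + months_ahead)


used_gb = [500, 520, 540, 560, 580, 600]  # hypothetical storage used per month
print(linear_forecast(used_gb, 6))  # steady 20 GB/month growth -> 720.0
```

Real usage is rarely this tidy, which is why having the service do the forecasting across all targets is appealing.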

With Performance Analytics you can identify bottlenecks across Database or Middleware targets. Have you ever wondered what the worst SQL in your environment is? I had a manager once who really liked to point out the worst-performing SQL statements across applications. How easy that will be now!

Probably the most exciting feature is the Data Explorer, which allows you to create custom queries and save them to a dashboard built for your unique requirements. Below I’ve searched for database instances based on their DB Wait time. Hovering over the chart gives you a popup with the target, type of wait and value. I can now save this widget to a dashboard if I like.

Learn More

More information on OMC can be found in this ebook from Oracle and on the Oracle Cloud website.   You can also watch the launch video here.   I will be posting more about each of the services over the next few weeks, as well as sharing my experiences with our first customer implementations, so stay tuned!

Automating the Mundane with Corrective Actions and Oracle Enterprise Manager

In my opinion, one of the most under-utilized features of Oracle Enterprise Manager is the Corrective Actions that can be triggered when a metric alert threshold is crossed. I think one reason they’re under-utilized is that it’s very hard to think about where to start and what can be automated. My advice, from having implemented them before, is to look at the alerts or tickets that are generated and determine which ones are the most frequent and mundane. The one that always comes to mind is Archive Destination.

Archive was the first Corrective Action (CA) we implemented at my previous company, because we got at least 20 of these alerts per day. Since our backups were controlled by a different team, all we could do was cut a ticket to them, and possibly kick off an archive backup hoping it would complete this time. So we put this in a Corrective Action. The script checked for hung backup sessions, checked that a backup wasn’t already running, looked through a config file to get the right information, and then kicked off an archive backup job. Then it sent an email to a ticketing queue to generate a ticket for the backup team so they could investigate why it was failing. We set the CA to run on Warning, with a fairly low threshold, so we had plenty of time to react if it got to Critical. This was such a success that we went on to write more, including automating tablespace adds and sending notifications to the application teams when the processes/sessions limit was exceeded.
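The flow of that archive script can be sketched roughly like this. It is not our original script: every function passed in is a hypothetical stand-in for a real check or job, and what to do when hung sessions are found (here: skip the backup and still raise a ticket) is my assumption.

```python
def archive_corrective_action(has_hung_sessions, backup_running,
                              load_config, start_backup, open_ticket):
    """Run the archive-destination steps in order; return a short status."""
    if has_hung_sessions():
        # Assumption: don't pile a new backup on top of hung sessions.
        open_ticket("hung backup sessions found; archive backup not started")
        return "skipped: hung sessions"
    if backup_running():
        return "skipped: backup already running"
    config = load_config()  # stand-in for reading the backup config file
    start_backup(config)
    open_ticket("archive destination warning; archive backup started")
    return "started"


# Usage with stubs that just record what happened.
tickets, backups = [], []
status = archive_corrective_action(
    has_hung_sessions=lambda: False,
    backup_running=lambda: False,
    load_config=lambda: {"target": "hypothetical_db"},
    start_backup=backups.append,
    open_ticket=tickets.append,
)
print(status, backups, tickets)
```

Passing the checks in as functions keeps the decision logic separate from the site-specific parts, which is handy when the same Corrective Action gets pushed to many servers.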

A friend of mine, Tyler Sharp, recently started implementing Corrective Actions and has found tremendous time savings. He recently had the idea to automate the oradebug steps so they would always have the required debug output when working with Oracle Support, instead of having to go through the process manually the next time around. The CA is triggered when more than 4 active sessions are waiting on concurrency, or when DB blocking time exceeds 900 seconds. He was kind enough to share the script they’ve implemented below:

conn / as sysdba
set serverout on

-- Tag this session's trace file with a timestamp so the dumps are easy to find
DECLARE
   trace_name      VARCHAR2 (1000) := NULL;
   alter_session   VARCHAR2 (1000) := NULL;

BEGIN
   SELECT to_char (SYSDATE, 'MMDDYY_HHMISS') INTO trace_name FROM DUAL;
   alter_session :=
         'alter session set tracefile_identifier='''||trace_name||'_AUTO_HANGANALYZE''';
   DBMS_OUTPUT.PUT_LINE (alter_session);
   EXECUTE IMMEDIATE alter_session;
END;
/

-- Attach oradebug to this session and remove the trace file size limit
oradebug setmypid;
oradebug unlimit;
-- Dump recent ASH data
oradebug dump ashdumpseconds 5
-- Take two level 3 hang analyses, 180 seconds apart
oradebug hanganalyze 3
execute sys.dbms_lock.sleep(180);
oradebug hanganalyze 3

As you can see, they set a trace file name, take an ashdump, then do the required hanganalyze twice with a sleep in between. Now the DBA can skip these steps when working an issue, and collect files that were created at the right time, not 10 minutes later. You’ll need a credential with SYSDBA access to run this properly.
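The trigger condition for this CA is an either/or check, which can be written out like this. The function and parameter names are mine, and in EM the condition lives in the metric thresholds rather than in code, so this is only to spell out the logic.

```python
def should_collect_hang_diagnostics(concurrency_waiters: int,
                                    blocking_seconds: float) -> bool:
    """True when either trigger from the post is met:
    more than 4 active sessions waiting on concurrency,
    or more than 900 seconds of DB blocking time."""
    return concurrency_waiters > 4 or blocking_seconds > 900


print(should_collect_hang_diagnostics(5, 0))    # True: too many waiters
print(should_collect_hang_diagnostics(2, 950))  # True: blocking time exceeded
print(should_collect_hang_diagnostics(4, 900))  # False: neither limit exceeded
```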

The great thing about Corrective Actions is that you can use them in a template and push them to all servers to keep your resolutions standard. The Corrective Action can be triggered on Warning, Critical, or both. You then have the choice to be notified of the alert right away, or to bypass notifications unless the Corrective Action fails. This gives you a fallback in case the script or job has a problem fixing the issue.
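The notification choice boils down to a small decision, sketched here. This is my reading of the behavior, not EM's code: the names are hypothetical, and treating "no Corrective Action ran" the same as a failure is an assumption.

```python
from typing import Optional


def should_notify(ca_succeeded: Optional[bool], only_on_ca_failure: bool) -> bool:
    """Decide whether the alert notification goes out.

    ca_succeeded is None when no Corrective Action ran for the alert.
    """
    if not only_on_ca_failure:
        return True  # notify right away, whatever the CA did
    # Bypass the notification only when the CA ran and fixed the issue.
    return ca_succeeded is not True


print(should_notify(ca_succeeded=True, only_on_ca_failure=True))   # False
print(should_notify(ca_succeeded=False, only_on_ca_failure=True))  # True
print(should_notify(ca_succeeded=True, only_on_ca_failure=False))  # True
```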

To learn more about Corrective Actions, check the Oracle Enterprise Manager 12c Cloud Control Administrator’s Guide and check out the following blog posts for more ideas!

What are the Corrective Actions you’ve implemented, or would like to implement in your environment?