Note: Proofread any scripts before using. Always try scripts on a test instance first. This Blog is not responsible for any damage.
What is Data Guard?
Data Guard, which was introduced as the standby database in Oracle database Version 7.3 under the name of Data Guard with Version 9i , is a data protection and availability solution for Oracle databases. The basic function of Oracle Data Guard is to keep a synchronized copy of a database as standby, in order to make provision, incase the primary database is inaccessible to end users. These cases are hardware errors, natural disasters, and so on. Each new Oracle release added new functionalities to Data Guard and the product became more and more popular with offerings such as data protection, high availability, and disaster recovery for Oracle databases.Using Oracle Data Guard, it's possible to direct user connections to a Data Guard standby database automatically with no data loss, in case of an outage in the primary database. Data Guard also offers taking advantage of the standby database for reporting, test, and backup offloading. Corruptions on the primary database may be fixed automatically by using the non-corrupted data blocks on the standby database. There will be minimal outages (seconds to minutes) on the primary database in planned maintenances such as patching and hardware changes by using the switchover feature of Data Guard, which changes the roles of the primary and standby databases. All of these features are available with Data Guard, which doesn't require an installation but a cloning and configuration of the Oracle database.
A Data Guard configuration consists of two main components: primary database and standby database. The primary database is the database for which we want to take precaution for its inaccessibility. Fundamentally, changes on the data of the primary database are passed through the standby database and these changes are applied to the standby database in order to keep it synchronized.
The following figure shows the general structure of Data Guard:
Let's look at the standby database and its properties more closely.
Standby database
It is possible to configure a standby database simply by copying, cloning, or restoring a primary database to a different server. Then the Data Guard configurations are made on the databases in order to start the transfer of redo information from primary to standby and also to start the apply process on the standby database.
Primary and standby databases may exist on the same server; however, this kind of configuration should only be used for testing. In a production environment, the primary and standby database servers are generally preferred to be on separate data centers.
Data Guard keeps the primary and standby databases synchronized by using redo information. As you may know, transactions on an Oracle database produce redo records. This redo information keeps all of the changes made to the database. The Oracle database first creates redo information in memory (redo log buffers). Then they're written into online redo logfiles, and when an online redo logfile is full, its content is written into an archived redo log.
An Oracle database can run in the ARCHIVELOG mode or the NOARCHIVELOG mode. In the ARCHIVELOG mode, online redo logfiles are written into archived redo logs and in the NOARCHIVELOG mode, redo logfiles are overwritten without being archived as they become full. In a Data Guard environment, the primary database must be in the ARCHIVELOG mode.
In Data Guard, transfer of the changed data from the primary to standby database is achieved by redo with no alternative. However, the apply process of the redo content to the standby database may vary. The different methods on the apply process reveal different type of standby databases.There were two kinds of standby databases before Oracle database Version 11 g , which were: physical standby database and logical standby database. Within Version 11 g we should mention a third type of standby database which is snapshot standby. Let's look at the properties of these standby database types.
Physical standby database
The Physical standby database is a block-based copy of the primary database. In a physical standby environment, in addition to containing the same database objects and same data, the primary and standby databases are identical on a block-for-block basis. Physical standby databases use Redo Apply method to apply changes. Redo Apply uses Managed recovery process ( MRP ) in order to manage application of the change in information on redo.In Version 11 g , a physical standby database can be accessible in read-only mode while Redo Apply is working, which is called Active Data Guard. Using the Active Data Guard feature, we can offload report jobs from the primary to physical standby database.
Physical standby database is the only option that has no limitation on storage vendor or data types to keep a synchronized copy of the primary database.
Logical standby database
Logical standby database is a feature introduced in Version 9 i R2. In this configuration, redo data is first converted into SQL statements and then applied to the standby database. This process is called SQL Apply. This method makes it possible to access the standby database permanently and allows read/write while the replication of data is active. Thus, you're also able to create database objects on the standby database that don't exist on the primary database. So a logical standby database can be used for many other purposes along with high availability and disaster recovery.Due to the basics of SQL Apply, a logical standby database will contain the same data as the primary database but in a different structure on the disks.
One discouraging aspect of the logical standby database is the unsupported data types, objects, and DDLs. The following data types are not supported to be replicated in a logical standby environment:
- BFILE
- Collections (including VARRAYS and nested tables)
- Multimedia data types (including Spatial, Image, and Oracle Text)
- ROWID and UROWID
- User-defined types
Snapshot standby database
Principally, a snapshot standby database is a special condition of a physical standby database. Snapshot standby is a feature that is available with Oracle Database Version 11g . When you convert a Physical standby database into a snapshot standby database, it becomes accessible for read/write. You can run tests on this database and change the data. When you're finished with the snapshot standby database, it's possible to reverse all the changes made to the database and turn it back to a physical standby again.An important point here is that a snapshot standby database can't run Redo Apply. Redo transfer continues but standby is not able to apply redo.
Oracle Data Guard evolution
It has been a long time that the Oracle Data Guard technology has been in the database administrator's life and it apparently evolved from the beginning until 11 g R2. Let's look at this evolution closely through the different database versions.Version 7.3 – stone age
The functionality of keeping a duplicate database in a separate server, which can be synchronized with the primary database, came with Oracle database Version 7.3 under the name of standby database. This standby database was constantly in recovery mode waiting for the archived redo logs to be synchronized. However, this feature was not able to automate the transfer of archived redo logs. Database administrators had to find a way to transfer archived redo logs and apply them to the standby server continuously. This was generally accomplished by a script running in the background.The only aim of Version 7.3 of the standby database was disaster recovery. It was not possible to query the standby database or to open it for any purpose other than activating it in the event of failure of the primary database. Once the standby database was activated, it couldn't be returned to the standby recovery mode again.
Version 8i – first age
Oracle database Version 8 i brought the much-awaited features to the standby database and made the archived log shipping and apply process automatic, which is now called managed standby environment and managed recovery, respectively. However, some users were choosing to apply the archived logs manually because it was not possible to set a delay in the managed recovery mode. This mode was bringing the risk of the accidental operations to reflect standby database quickly.Along with the "managed" modes, 8 i made it possible to open a standby database with the read-only option and allowed it to be used as a reporting database.
Even though there were new features that made the tool more manageable and practical, there were still serious deficiencies. For example, when we added a datafile or created a tablespace on the primary database, these changes were not being replicated to the standby database. Database administrators had to take care of this maintenance on the standby database. Also when we opened the primary database with resetlogs or restored a backup control file, we had to re-create the standby database.
Version 9i – middle age
First of all, with this version Oracle8 i standby database was renamed to Oracle9 i Data Guard. 9 i Data Guard includes very important new features, which makes the product much more reliable and functional. The following features were included:- Oracle Data Guard Broker management framework, which is used to centralize and automate the configuration, monitoring, and management of Oracle Data Guard installations, was introduced with this version.
- Zero data loss on failover was guaranteed as a configuration option.
- Switchover was introduced, which made it possible to change the roles of primary and standby. This made it possible to accomplish a planned maintenance on the primary database with very less service outage.
- Standby database administration became simpler because new datafiles on the primary database are created automatically on standby and if there are missing archived logs on standby, which is called gap; Data Guard detects and transmits the missing logs to standby automatically.
- Delay option was added, which made it possible to configure a standby database that is always behind the primary in a specified time delay.
- Parallel recovery increased recovery performance on the standby database.
- Logical standby database was introduced, which we've mentioned earlier in this article
- Three data protection modes were ready to use: Maximum Protection, Maximum Availability, and Maximum Performance, which offered more flexibility on configuration
- The Cascade standby database feature made it possible to configure a second standby database, which receives its redo data from the first standby database
Version 10g – new age
The 10g version again introduced important features of Data Guard but we can say that it perhaps fell behind expectations because of the revolutionary changes in release 9i . The following new features were introduces in Version 10g :- One of the most important features of 10g was the Real-Time Apply. When running in Real-Time Apply mode, the standby database applies changes on the redo immediately after receiving it. Standby does not wait for the standby redo logfile to be archived. This provides faster switchover and failover.
- Flashback database support was introduced, which made it unnecessary to configure a delay in the Data Guard configuration. Using flashback technology, it was possible to flash back a standby database to a point in time.
- With 10g Data Guard, if we open a primary database with resetlogs it was not required to re-create the standby database. Standby was able to recover through resetlogs.
- Version 10g made it possible to use logical standby databases in the database software rolling upgrades of the primary database. This method made it possible to lessen the service outage time by performing switchover to the logical standby database.
- Fast-start failover automated and accelerated the failover operation when the primary database was lost. This option strengthened the disaster recovery role of Oracle Data Guard.
- Guaranteed restore point was not actually a Data Guard feature. It was a database feature, which made it possible to revert a database to the moment that Guaranteed restore point was created, as long as there is sufficient disk space for the flashback logs. Using this feature following scenario became possible: Activate a physical standby database after stopping Redo Apply, use it for testing with read/write operations, then revert the changes, make it standby again and synchronize it with the primary. Using a standby database read/write was offering a great flexibility to users but the archived log shipping was not able to continue while the standby is read/write and this was causing data loss on the possible primary database failure.
Version 11g – modern age
Oracle database version 11g offered the expected jump in the Data Guard technology, especially with two new features, which are called Active Data Guard and snapshot standby. The following features were introduced:- Active Data Guard has been a milestone in Data Guard history, which enables a query from a physical standby database while the media recovery is active.
- Snapshot standby is a feature to use a physical standby database read/write for test purposes. As we mentioned, this was possible with 10g R2 Guaranteed restore point feature but 11g provided the continuous archived log shipping in the time period that standby is read/write with snapshot standby.
- It has been possible to compress redo traffic in a Data Guard configuration, which is useful in excessive redo generation rates and resolving gaps. Compression of redo when resolving gaps was introduced in 11g R1 and compression of all redo data was introduced in 11g R2.
- Use of the physical standby databases for the rolling upgrades of database software was enabled, aka Transient Logical Standby.
- It became possible to include different operating systems in a Data Guard configuration such as Windows and Linux.
- Lost-write, which is a serious data corruption type arising from the misinformation of storage subsystem on completing the write of a block, can be detected in an 11g Data Guard configuration. Recovery is automatically stopped in such a case.
- RMAN fast incremental backup feature "Block Change Tracking" can be run on an Active Data Guard enabled standby database.
- Another very important enhancement in 11g was Automatic Block Corruption Repair feature that was introduced with 11g R2. With this feature, a corrupted data block in the primary database can be automatically replaced with an uncorrupted copy from a physical standby database in Active Data Guard mode and vice versa.
Oracle Data Guard architecture
The main architecture of Oracle Data Guard 11g R2 includes a primary database, up to 30 standby databases, the redo transport services, (which automatically ship the redo log data from the primary to standby server), and Apply Services (which applies the changes in redo on the standby database). There are of course some background processes special to a Data Guard configuration, which run the services in question.In a Data Guard configuration, the switchover and failover concepts are also very important. By performing a switchover, it's possible to change the roles of the primary and standby databases and change the direction of the redo shipping. Failover is the option that we must use to open a standby database to user connection in read/write mode, when the primary database is inaccessible.
The last Data Guard components that we'll mention in this article are user interfaces to monitor and administrate a Data Guard configuration. These are SQL*Plus, Oracle Enterprise Manager Cloud Control, and Data Guard broker command-line interface (DGMGRL).
Data Guard services
These services are the vital points of a Data Guard configuration. Database administrators should decide and use the proper configuration to supply the business needs and tune these services to comply with SLAs.Redo transport services
In a primary database, when a user commits a transaction, the relevant redo data is written into online redo logfiles from memory (Redo Log Buffer). After the online redo log group becomes full it is archived into an archived redo logfile with a log switch. It's possible to configure Data Guard sending the redo data to standby databases from the log buffer as the transactions are committed (by LGWR process) or from the online redo logfiles when they're being archived (by ARCn processes). Shipping redo data with ARCH will result in more data loss in the case of primary database failure because the data change information in the current online log of primary will be lost.The following diagram shows the Data Guard configuration with ARCH transportation mode:
- Logs are sent by the ARCH process; the LNS process is not in use
- Standby redo logs are not mandatory on the standby database
- Data in the unarchived online redo log will be lost in a failover
The following diagram shows the Data Guard configuration with LGWR and SYNC transportation mode:
- Redo is read and sent to the standby database directly from the log buffer by the LNS process
- Acknowledgment needed from the standby database (RFS to LNS and LNS to LGWR) to send COMMIT ACK to the database user
- It's mandatory to use standby redo logs
- Zero data loss in failover can be guaranteed with this configuration
- There maybe slower response times on the primary database
- The primary database stops giving service in a network disruption incident between primary and standby
If SYNC redo transport is chosen in an 11 g Data Guard configuration, the performance decrease on the primary database will be less than the earlier releases. Previously, the primary database used to finish writes to the online redo log first and then send redo to the standby database. There were two consecutive I/O operations that the primary database needs to wait for in order to complete the commit. In 11g these two I/O operations run in parallel. The primary database does not wait for finishing writes to online redo log and it sends the redo data to standby at the same time.
The other option is to use the Asynchronous redo transport ( ASYNC ) method, which avoids the impact to primary database performance. In this method, the primary database never waits for any acknowledgment from the standby database in order to complete the commit. In the ASYNC redo transport method we have the performance gain; however, this method does not guarantee zero data loss failovers because it does not guarantee all the committed transactions being received by the standby database at any moment.- No acknowledgment needed from standby to send the COMMIT ACK to the database user
- Redo is read and sent to standby from the Redo Log Buffer or online redo logs by the LNS process. If LNS cannot catch the send data in the Redo Log Buffer before it is recycled, it automatically reads and sends redo data from the online redo log.
- The committed transactions that weren't shipped to standby yet, may be lost in a failover
- Potential slower response time on primary database with SYNC mode is not valid here
Protection modes
Data Guard offers three data protection modes, which serve different business needs in terms of data protection and performance. You can find the properties of these modes in the following comparison table:Mode | Redo transport | Action with no standby database connection | Risk of data loss |
Maximum Protection | SYNC and LGWR | The primary database needs to write redo to at least one standby database. Otherwise it will shut down. | Zero data loss is guaranteed. |
Maximum Availability | SYNC and LGWR | Normally works with SYNC redo transport. If the primary database cannot write redo to any of its standby databases, it continues processing transactions as in ASYNC mode. | Zero data loss in normal operation, but not guaranteed. |
Maximum Performance | ASYNC and LGWR/ARCH | Never expects acknowledgment from the standby database. | Potential for minimal data loss in a normal operation. |
Apply services
Data Guard automatically transfers redo data from the primary to standby database and applies it on the standby database. Redo transport services work independent of apply services and never wait for Redo Apply but if there's a problem on redo transportation, apply services normally stop and wait for the new redo to arrive. The most important categorization in apply services is the Redo Apply and SQL Apply. These apply methods create the infrastructure of physical and logical standby databases.As a property of Data Guard, both in Redo Apply and SQL Apply, the standby database validates the redo data in order to prevent physical corruptions that may occur at the primary database from reflecting to the standby database. By default, the standby database writes received redo data into the standby redo logfiles and apply services do not apply redo until the standby redo log is archived as an archived redo log. If we use the real-time apply feature, which became available with 10 g , the apply services don't wait for the archival operation and apply the redo data as it's received and written into the standby redo logs.
It's also possible to specify a delay value to keep the standby database behind the primary database with the specified minutes. This may be chosen to prevent human error operations on the primary database to be applied to standby immediately. However, as we discussed previously, after the support of flashback database, there's no need to define a delay in Data Guard configuration.
Redo Apply (physical standby databases)
Redo Apply keeps a block-by-block copy of the primary database. By default, Redo Apply automatically runs a parallel apply processes, which is equal to the number of CPUs of the standby database server minus one. These parallel recovery processes are controlled by the MRP process, which is the background process responsible for the application of redo data.Redo Apply has the following benefits for its users:
- There are no unsupported data types, objects, and DDLs
- Redo Apply has higher performance when compared with SQL Apply or any other replication solutions
- It offers simple management by keeping the database structure exactly the same as the primary database with its fully automated architecture
- It's possible to take advantages of Active Data Guard and snapshot standby for reporting and testing
- Backups taken from physical standby databases are ready to be restored to primary. So we can offload the backup from primary
- Redo Apply offers a strong corruption detection and prevention mechanism.
- It's possible to use physical standby databases for the rolling upgrades of the database software, which is known as transient logical standby
- The real-time apply feature applies the redo as it's received. This feature makes it possible to query real-time or near real-time data from the standby database
Monitoring Redo Apply
While Redo Apply runs on the standby database, administrators need to monitor the status of the apply process and check if it's working in accordance with the selected configuration. As mentioned, the MRP process is responsible from the Redo Apply process and monitoring the status of this process will give us valuable information on what's going on with Redo Aapply.Time for action – monitoring Redo Apply
We'll install Data Guard configuration later. So, you will not be able to perform the actions in this article on the test environment. Please just read the actions to consolidate the given theoretical information mentioned earlier.We'll query the v$managed_standby view on the standby database for monitoring. The Data Guard configuration is in the Maximum Performance mode with ASYNC and LGWR attributes. We'll change the redo transport and apply characteristic and monitor the behavior of Data Guard.
- For our first test, a one hour delay is defined. Let's check this by running the following query on the primary database:
We can see that a 60-minute delay is defined on the primary database. This doesn't mean that the redo data will be sent with a 60-minute delay. This setting means the redo data will be sent immediately but the standby database will not apply the redo that was received in the last 60 minutes.SQL> select name, value from v$parameter where name like 'log_archive_dest_2'; NAME VALUE ------------------- ---------------------------------------- log_archive_dest_2 SERVICE=TEST_STANDBY LGWR ASYNCVALID_FOR= DELAY=60 (ONLINE_LOGFILES,PRIMARY_ROLE)DB_UNIQUE_NAME=TEST - So let's see what's happening on the standby side by running the following query on the standby database. (Note: We can connect to a standby database from the standby database server with the sqlplus / as sysdba command. This allows us to connect to the database as a sys user and with password file authentication.)SQL> select process, status, thread#, sequence#, block#, blocksfrom v$managed_standby; PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS --------- ------------ ---------- ---------- ---------- ---------- ARCH CONNECTED 0 0 0 0 ARCH CONNECTED 0 0 0 0 MRP0 WAIT_FOR_LOG 1 461 0 0 RFS IDLE 0 0 0 0 RFS IDLE 1 469 1727085 40
- The output shows that the log with the sequence 469 is being received from primary, but the MRP process is still waiting for the log with the sequence number 461. Let's check if this log has been received:SQL> select name, archived from v$archived_log wheresequence#=461; NAME ARC ----------------------------------------------------------- -- +FRA/test/archivelog/2012_08_08/thread_1_seq_461.2606.7908 YES
- So the log sequence 461 was received but MRP is not applying it because of the configured 60-minute delay on the primary database. We can see this situation more clearly on the alert log:
The highlighted line in the previous code shows that the log sequence 461 was received at 22:31 but will be available to use only after 60 minutes.RFS[1]: Archived Log: '+FRA/test/archivelog/2012_08_08/thread_1_seq_461. 2606.790810199' Wed Aug 8 22:31:28 2012 RFS[1]: Archive log thread 1 sequence 461 available in 60 minute(s) Wed Aug 8 23:14:48 2012 Media Recovery Log +FRA/test/archivelog/2012_08_08/thread_1_seq_460.2841 .790809291 Media Recovery Delayed for 60 minute(s) - Now let's cancel the delay on the media recovery and monitor again. On the primary database perform the following:SQL> alter system set log_archive_dest_2='SERVICE = TEST_STANDBYLGWR ASYNC VALID_FOR = (ONLINE_LOGFILES,PRIMARY_ROLE)DB_UNIQUE_NAME = TEST'; System altered.
- After a few minutes on the standby database perform the following:
We can see that, the MRP is not waiting for any old sequence; it's waiting for the log sequence that is on the way from primary to standby. (Because the LGWR attribute is used on log transport, this log is the current log sequence on the primary.)SQL> select process, status, thread#, sequence#, block#, blocksfrom v$managed_standby; PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS --------- ------------ ---------- ---------- ---------- ------ ARCH CONNECTED 0 0 0 0 ARCH CLOSING 1 470 3432448 403 MRP0 WAIT_FOR_LOG 1 471 0 0 RFS IDLE 0 0 0 0 RFS IDLE 1 471 878728 2 - Let's look at the alert log again:
As you can see there's no text in alert log about the delay, because it was cancelled. The MRP process applied the log sequence 470 and started to wait for the next log (471) to completely arrive and get archived. It also indicates that the next log is in transit, which means it is currently being received by RFS.Thu Aug 09 00:27:16 2012 Media Recovery Log +FRA/test/archivelog/2012_08_09/thread_1_seq_470.515 .790820745 Thu Aug 09 00:27:57 2012 Media Recovery Waiting for thread 1 sequence 471 (in transit) - Let's convert the Redo Apply mode to real-time apply and see how Data Guard will apply the redo as it received from the primary database. First we'll stop Redo Apply on the standby database and start again in the real-time apply mode:SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; Database altered. SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USINGCURRENT LOGFILE DISCONNECT FROM SESSION; Database altered.
- After a few minutes we will check the status of the processes:SQL> select process, status, thread#, sequence#, block#, blocksfromv$managed_standby; PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS --------- ------------ ---------- -------- --------- ------- ARCH CONNECTED 0 0 0 0 ARCH CLOSING 1 472 3432448 403 MRP0 APPLYING_LOG 1 473 1985328 4096000 RFS IDLE 0 0 0 0 RFS IDLE 1 473 1985957 11
You should also know that, even there is a DELAY value specified on the primary database; if the apply mode is real-time apply on the standby database, the DELAY will be ignored. You'll see the following lines in the standby alert log in such a case:
Managed standby recovery started with USING CURRENT LOGFILE Ignoring previously specified DELAY 60 minutes
What just happened?
You have just seen the Redo Apply behavior on different Data Guard configurations such as delayed, non-delayed, and real-time apply. You learned how to query the status of the important Data Guard processes MRP and RFS on the standby database.SQL Apply (logical standby databases)
The SQL Apply technology resides on mining the standby redo logs, building SQL transactions that apply the changes in question, and finally, executing the SQL on the standby database, which is read/write accessible. This process is more expensive in terms of hardware resource usage as a matter of course. The LSP process manages the application of changes to a logical standby database.The general purpose of building a logical standby database is reporting the needs with read/write access requirement. SQL Apply is not suitable for disaster recovery and high availability as much as Redo Apply because of the unsupported data types and logically different database infrastructure.
SQL Apply offers the following benefits to its users:
- The logical standby database is always read/write accessible while SQL Apply is running; so that users may run reports, create temporary tables and indexes for performance issues. Also it's possible to create objects and keep data on the standby database, which do not exist on primary.
- The logical standby database is open for read/write activity. But normally there are no writes possible on the standby objects, which exist on primary. This feature maintains the consistency of the replicated primary data.
- It's possible to upgrade the Oracle database software version with almost no downtime using a logical standby database.
Role transitions
Role transitions basically enable users to change the roles of the databases in a Data Guard configuration. There are two role transition options in Data Guard, which are switchover and failover .Switchover
In a basic Data Guard configuration with one primary and one standby database, a switchover operation changes the roles of these databases, and so the direction of the redo shipping. In a correctly designed configuration, archived log shipping in the opposite direction starts immediately after switchover and clients do not need to change their connection descriptions in order to connect the new primary database.If there is more than one standby database in a Data Guard configuration, it's possible to perform switchover between the primary and any of the standby databases. After the switchover, the new primary database can continue to send redo to all of the standby databases in the configuration.
Regardless of the configuration of Data Guard, a switchover operation always guarantees zero data loss. This brings high reliability to switchover and thus it's widely used for planned maintenance operations, such as hardware or operating system upgrades, database software rolling upgrade, and other infrastructure maintenances. Switchover reduces the downtime for these maintenance operations by a significant amount of time.
Failover
Failover is the operation of converting a standby database to a primary database, because of a failure in the original primary database. If the flashback database is disabled on the primary database, failover is an operation with no return. In other words, we have to flashback the failed primary database to a state before failover in order to re-establish the configuration. Without flashback, Data Guard configuration needs to be built from scratch.A manual database failover may be performed in the case of failure with the initiative of the database owner. However, this will require extra outage for the decision making. If fast-start failover is used, which is a 10 g release 2 feature, the failover operation will perform automatically.
Fast-start failover
This property of automating the failover operation can only be used in Data Guard broker enabled configuration. The observer process which runs on a different server from the primary and standby databases, continuously monitors the accessibility of the primary database. If both the observer and the standby database cannot reach the primary database for a predefined length of time, a fully-automated failover process is started. With 11 g Release 2, we call it fully automated, because this process includes changing the role of the standby as primary, starting the database services on the new primary database, disconnecting the client from the failed primary database, and redirecting them to the new primary database.If the observer establishes the connection with the original primary database again after the failover, it informs the database that the failover was performed and it will automatically reinstate the database using flashback. In order to configure fast-start failover, we need to specify the fast recovery area and enable flashback on the primary and standby databases.
Keep in mind that in Version 11 g , Data Guard must be on Maximum Availability or Maximum Performance mode in order to use fast-start failover. In 10 g Release 2, only Maximum Availability mode is supported for fast-start failover.
User interfaces for administering Data Guard
There are three options for a database administrator to manage a Data Guard environment, which are SQL*Plus command-line interface, Oracle Enterprise Manager, and Data Guard broker command-line interface (DGMGRL). In almost every IT infrastructure management interface, command-line tools offer great flexibility and detailed options and the graphical interfaces are user friendly, simple, and automated.SQL*Plus
SQL*Plus provides all kinds of administration and monitoring operations for the administrators, but you'll need to access each server in the Data Guard configuration and do the operations separately. It's also sometimes painful to have easy readable outputs from SQL*Plus.DGMGRL
Data Guard broker command-line interface (DGMGRL) is the Data Guard broker tool that automates and centralizes Data Guard management. Using DGMGRL we can run some consecutive operations such as switchover and failover with just one command. Also, the status of the Data Guard configuration can be queried with special Data Guard broker commands via DGMGRL. Outputs are designed to be easily readable.Enterprise Manager
Enterprise Manager offers an integrated graphical user interface for Data Guard broker enabled Data Guard configurations. It's possible to graphically monitor the general configuration information, performance, synchronization status of Data Guard, and also perform administration tasks such as switchover, failover, adding, and removing standby database from configuration.Time for action – using interfaces to monitor Data Guard
- At the first step we will use SQL*Plus to gather information from Data Guard and monitor its status. The connection to the standby database must be from the standby database server with password file authentication if the standby database is on mount mode and so not accessible from outside. If Active Data Guard is enabled, it's also possible to connect a standby database remotely. Let's connect to the standby database and gather the main Data Guard configuration information:
We have a physical standby database with the Maximum Performance mode. The value of the OPEN_MODE column is READ ONLY WITH APPLY, which indicates that Active Data Guard is enabled. The output of the second query shows that real-time apply is being used as the recovery mode.$sqlplus / as sysdba SQL> select database_role,open_mode,protection_mode from v$database; DATABASE_ROLE OPEN_MODE PROTECTION_MODE ---------------- -------------------- -------------------- PHYSICAL STANDBY READ ONLY WITH APPLY MAXIMUM PERFORMANCE SQL> select recovery_mode from v$archive_dest_status where recovery_mode ! = 'IDLE'; RECOVERY_MODE ----------------------- MANAGED REAL TIME APPLY - Now let's check the status of the Data Guard synchronization:
The output shows that we have a fully synchronized standby database, where there is no redo transport and apply lag. The estimated startup time value is 231 seconds, which is an estimate of the time needed to start and open the standby database.SQL> select name, value from v$dataguard_stats; NAME VALUE ------------------------- --------------- transport lag +00 00:00:00 apply lag +00 00:00:00 apply finish time estimated startup time 231 - Now we'll see an example about how to use Data Guard broker command-line interface (DGMGRL) to gather information about the Data Guard status. We can run DGMGRL on the primary database server and connect locally or we can also connect from a remote server. Let's connect from the primary database server locally:$dgmgrl DGMGRL> connect sys/password; Connected. We have connected to the primary database with the sys user. Now we can check the configuration. DGMGRL> show configuration; Configuration - TEST Protection Mode: MaxPerformance Databases: Turkey - Primary database India - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS
- We had the general configuration information with the show configuration command. At the end of the output we see the configuration status as SUCCESS, which means, everything in the broker configuration is working properly. However, we can also see a status of warning or error. We can also run the show database command for some general information:DGMGRL> show database 'India'; Database Name: India Role: PHYSICAL STANDBY Enabled: YES Intended State: ONLINE Instance(s): india Current status for "India": SUCCESSIn order to gather detailed information from the databases in the Data Guard configuration, we use the keyword verbose in the show database command such as show database verbose 'India'.
- The last interface to monitor and manage a Data Guard configuration is the Enterprise Manager Cloud Control, with the former name Enterprise Manager Grid Control. The following screenshot shows the interface for the monitoring and administration of Data Guard.
What just happened?
You have just seen examples of monitoring the Data Guard environment with three different interfaces. These examples are just intended to give you a first impression of what these interfaces look like.All of these interfaces can be used to monitor and manage the Data Guard; however, they all have their own pros and cons. If you already use Enterprise Manager Cloud Control in your current IT infrastructure, Data Guard installations must be added as targets in order to take advantage of its visual and easy monitoring and management potential. If you don't have Cloud Control but have multiple Data Guard installations, you should think about using it to overcome the challenges of central monitoring.
Data Guard background processes
In a Data Guard configuration we can see some Oracle Data Guard specific background processes in both, primary and standby databases. These processes perform the operations of redo transport and apply services. Data Guard broker also has some specific background processes. We can see the description and duties of the most important Data Guard processes as follows:- MRP0 ( Managed Standby Recovery Process ) coordinates the read and apply process of redo in a physical standby database.
- RFS ( Remote File Server ) is responsible for receiving the redo data, which is sent by the primary database to the standby database.
- LSP0 ( Logical Standby Coordinator Process ) coordinates the SQL Apply processes, which are the mining processes and apply processes.
- LSP1 ( Logical Standby Dictionary Build Process ) is used on the logical standby databases when a switchover or failover is in action.
- LSP2 ( Logical Standby Set Guard Process ) is used to operate Database Guard settings. Database Guard specifies which objects will be protected for modification in a logical standby database.
- NSAn ( Redo Transport NSA1 Process ) is used on the primary database to ship redo data to the standby database when ASYNC mode is being used. There may be multiple NSA processes such as NSA1 and NSA2.
- NSSn ( Redo Transport NSA1 Process ) is also used on the primary database to ship redo data to the standby database. However, only when the SYNC mode is being used.
- DMON ( Data Guard Broker Monitor Process ) runs on every instance in a Data Guard broker configuration. It communicates with local database and DMON processes of the remote databases. The broker-related requests and the monitoring information are transferred on this communication channel.
- FSFP ( Data Guard broker fast-start failover pinger process ) is used for the management of fast-start failover status.
Other replication solutions and Data Guard
There are many options to replicate an Oracle database data to a remote system. In the scope of disaster recovery, Oracle Data Guard and storage-based replication solutions such as EMC Symmetrix Remote Data Facility (SRDF), HP Continuous Access, Hitachi Universal Replicator and TrueCopy, IBM Global Mirror, and Metro Mirror are the main players in the market. When talking about Oracle database replication we also have to mention Oracle's well-known replication technologies GoldenGate and Streams. However, these products were not developed for disaster recovery fundamentally. Their primary aim is replication for ETL and data warehouse.There are also some third-party tools capable of replicating Oracle database data, but here we'll mention about the most commonly-used technologies: Data Guard, storage-based replication solutions, GoldenGate, and Streams.
Storage-based replication solutions
Storage-base replication solutions technologies are based upon the storage-array based replication of data. Thus, the source of data does not matter. All kinds of application and database data can be replicated to a remote location, where Data Guard is only able to replicate Oracle databases.In general there are two kinds of storage-based replication: synchronous and asynchronous replication. Synchronous replication means that each update to the source storage unit must also be updated in the target storage unit before another update can process. This guarantees zero data loss in the case of primary site failure. However, synchronous replication affects the I/O respond performance of the primary system depending on the distance between sites and network capacity. Therefore, this technology is distance limited. Synchronous replication technologies support up to 300 km distance between sites in the current technology level.
Asynchronous replication provides a long-distance replication solution with minimal impact on performance. In some products, the main problem with the asynchronous mode is the data consistency on the secondary site. The primary site sends a periodic, incremental copy of updates to the secondary site instead of a constant stream of updates. So there is no guarantee that dependent write operations on the primary site are transferred and applied to the remote destination in the same sequence.
Using storage-based replication solutions, it's not possible to start an Oracle instance and query database on the secondary site using the disks with the replicated data because of the data inconsistency issue. However Data Guard offers Active Data Guard, which enables users to query the standby database while replication is on the go. Some other advantages of Data Guard over storage-based replication solutions are enhanced corruption detection and prevention, automated database failover (fast-start failover), and RMAN backup offloading features that may not benefit from the use of storage-based replication solutions.
GoldenGate and Streams
GoldenGate is a data replication and integration tool for heterogeneous environments. It provides real-time capture, transformation, routing, and delivery of database transactions across heterogeneous systems (Oracle, DB2, MySQL, SQL Server, Sybase, Teradata, Netezza, and so on). Oracle agreed to acquire GoldenGate software in 2009 and then released 10.4, 11.1, and 11.2 versions with new enhancements. On the other hand, Streams is a built-in feature of the Oracle database that was first announced with database Version 9.2 and allows information sharing within an Oracle database or between Oracle databases.Their common property is their capability of capturing, propagating, and applying data changes between Oracle databases.
On the other hand their main differences are:
- The heterogeneous platforms and data integration support of GoldenGate is different from that of Streams
- License conditions for Streams is included in the Oracle Enterprise Edition license and GoldenGate is a self-licensed product
Oracle recommends Data Guard for full Oracle database protection with the high availability and disaster recovery purpose and recommends GoldenGate for information distribution and consolidation, application upgrades, changes, and also applications desiring flexible high availability needs.
An important feature of GoldenGate that makes the product different from its counterparts is the bidirectional replication capability, which is also called active-active replication. With this feature the primary and standby concepts are replaced by two active primary sites. Updates on site A are replicated to site B, and updates on site B are replicated to site A. The main challenges here are conflict handling and loop detection. A conflict is likely to occur in a bi-directional environment, where the same row or field is being updated at both sides and the changes are replicated. In this situation, a decision needs to be made if both transactions fail, or one transaction overwrites the other. The other key point is loop detection. If an update is replicated from site A to site B and then the same update from site B to site A, and so on, this loop needs to be detected and solved. The following diagram shows the general structure of an active-active GoldenGate configuration:The replication market's leaders, namely, Data Guard, storage-based replication products, and GoldenGate are compared in the following table with their most important features. Streams is out of this comparison because of the strategy mentioned by Oracle on its replication products:
Data
Guard
|
Storage-based
replication
|
GoldenGate
|
|
Hardware
independency
|
Supported.
Possible to choose different server/storage vendors for primary and standby.
|
Not
Supported. Must use the same storage vendor on both sides.
|
Supported.
Possible to choose different server/storage vendors for primary and standby.
|
Software
independency
|
Not
supported. Only Oracle database replication.
|
Supported.
All kinds of database and application data can be replicated.
|
Limited
support. Different database products can be replicated.
|
Zero
data loss capability
|
Supported
with Maximum Protection mode.
|
Limited
support with synchronous replication (distance limitation about 300 km).
|
Not
supported.
|
Corruption
detection and prevention
|
Supported.
|
Not
supported.
|
Not
supported.
|
Bidirectional
replication within one database
|
Not
supported.
|
Not
supported.
|
Supported.
Two active sites may send updates to each other.
|
Query
standby data
|
Supported
with Active Data Guard and Snapshot standby features.
|
Not
supported.
|
Supported
with continuously read/write accessible target databases.
|
Inside
database selective replication
|
Limited
support with logical standby databases.
|
Not
supported.
|
Supported.
Data may be selected and transformed before it hits the target.
|
Automatic
database failover
|
Supported
with fast-start failover feature.
|
Not
supported.
|
Not
supported.
|
GUI
based management
|
Supported.
|
Supported.
|
Supported.
|
RMAN
backup offload
|
Supported.
The primary database RMAN backups can be offloaded to a physical standby and
backups will physically be the same.
|
Not
supported.
|
Supported.
In a full replication of primary, RMAN backups may be offloaded but backups
will only logically be the same, not physically.
|
Cascaded
destinations for replication
|
Supported.
|
Supported.
|
Supported.
|
License
|
License
required only for Active Data Guard. Otherwise no extra license required.
|
License
required for storage replication software.
|
License
required for GoldenGate software.
|
Summary
You've reached the end of this article. We covered the definition, general properties, and history of Oracle Data Guard. It's very important to know the capabilities and general properties of similar products when implementing an IT solution. We have now gained an understanding of what Data Guard and the other main Oracle database replication products offer to its users. We're able to make decisions for the implementation of our replication requirements.-- Above content has been taken from Book :
No comments:
Post a Comment