OCI Compute Instance metrics / OCI Monitoring services (with video)

Hello everyone.

In my studies for the OCI Architect Professional Certification, I found a question in the official Practice Test whose answer is worth demonstrating, and that’s what I’m gonna do here.

The question is:

Screen Shot 2020-06-22 at 9.49.07 AM

The topic is OCI Compute Instance metrics and what is necessary for those metrics to be available to the OCI Monitoring service and to the other services that depend on it (Instance Pools and Autoscaling, for example).

I confess I picked the right answer by elimination, because none of the other three answers is feasible, but I didn’t remember having read about the need for a Service Gateway in such cases. You don’t need a public IP for the instance to propagate metrics (A and C), and Autoscaling works perfectly with multiple ADs (B).

So, let me show why it is really necessary to have a Service Gateway in order for a compute instance in a private subnet to be able to propagate its metrics to the Monitoring services.

According to the documentation (as of today, because OCI is constantly improving, so always check it to see if things still work the same way), in order for a Compute Instance to propagate its metrics to Monitoring Services you need to have:

  1. A supported image, which will have the monitoring agent installed (you can install the agent manually if you have an existing instance with a legacy image); AND
  2. A public IP address OR a Service Gateway.

As monitoring is optional for compute instances, you of course also need to enable this feature if you want the metrics to be monitored.

Well, the documentation has the answer, but let me show this in a practical example.

And, by the way, I will demonstrate that the documentation is not 100% accurate on this point: the metrics also work for a compute instance with only a private IP, as long as it has access to the Internet, through a NAT Gateway for example.

In order to demonstrate my point, I will create the following very simple architecture:

Screen Shot 2020-06-22 at 11.10.23 AM

I could create an Instance Pool and Autoscaling configuration to demonstrate the complete scenario described in the question we are analyzing, but I don’t think it’s necessary. The instances in an Instance Pool and the associated Autoscaling would work the same way as my single private instance in terms of propagating metrics.

If you want to create the autoscaling to go further, you can follow the steps in my previous post / presentation here.

I also won’t lose time configuring the Security Lists to be perfect in terms of security, because that’s not what we are studying here; I’ll just allow all ingress and egress traffic for both subnets. To follow the process here, be sure to configure your Route Tables and Security Lists accordingly.

All my instances will use the latest image and have monitoring enabled, in order to meet the first requirement for monitoring to work.

So, let’s start.

1. Create and configure the VCN, subnets, Route tables, Security Lists, and Compute instances

I assume that you already have enough knowledge (at least from my previous posts) to create and configure these items. If you don’t, you can check my previous posts and/or follow the video I am providing here (in Portuguese).

So, at this point I assume you have all these items created as shown in the images below:

VCN and subnets:

Screen Shot 2020-06-22 at 10.39.24 AM

Subnets Route Tables and Security Lists associations:

Screen Shot 2020-06-22 at 10.38.20 AM

Screen Shot 2020-06-22 at 10.38.44 AM

Security List rules:

Screen Shot 2020-06-22 at 10.40.41 AM

Screen Shot 2020-06-22 at 10.40.48 AM

Route tables:

Screen Shot 2020-06-22 at 10.41.49 AM

Screen Shot 2020-06-22 at 10.43.10 AM

Compute instances:

Don’t forget to upload your SSH public key when creating the instances.

I am choosing the suggested image (Oracle always suggests the latest Oracle Linux image, so it will support the monitoring agent), and monitoring is enabled by default, as shown here:

Screen Shot 2020-06-22 at 10.55.03 AM

Here are the instances created:

Screen Shot 2020-06-22 at 11.02.51 AM

2. Check metrics and monitoring

Wait a few minutes after the instances are created, then check whether they are collecting metrics and whether the metrics are being sent to the Monitoring services.

So, first look at metrics from the instances’ console page:

Pub1 instance:

Screen Shot 2020-06-22 at 11.13.16 AM

Priv1 instance:

Screen Shot 2020-06-22 at 11.12.24 AM

Then, go to Monitoring => Service Metrics and select the oci_computeagent namespace. You should see the metrics from both instances in the same chart:

Screen Shot 2020-06-22 at 11.14.07 AM

The charts confirm that the instance agent is collecting metrics and is able to send them through the Internet to the Monitoring Services endpoint.

You see? Even though my Priv1 instance has neither a public IP nor a Service Gateway configured, it is still able to send its metrics to the Monitoring Services because it has the NAT Gateway as a path to the Internet.
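If you prefer the command line to the console charts, the same check can be sketched with the OCI CLI. This is only a sketch under assumptions: it presumes the OCI CLI is installed and configured, `COMPARTMENT_OCID` is a placeholder for your own compartment, and the helper names `mql_query` and `query_compute_metric` are mine, not part of any tool. Please confirm the flags with `oci monitoring metric-data summarize-metrics-data --help` before relying on it.

```shell
# Build a Monitoring Query Language (MQL) expression,
# for example: CpuUtilization[1m].mean()
mql_query() {
  local metric="$1" interval="$2" statistic="$3"
  printf '%s[%s].%s()\n' "$metric" "$interval" "$statistic"
}

# Query the oci_computeagent namespace for one compartment.
# $1: compartment OCID (placeholder), $2: metric name such as CpuUtilization.
query_compute_metric() {
  local compartment_ocid="$1" metric="$2"
  oci monitoring metric-data summarize-metrics-data \
    --compartment-id "$compartment_ocid" \
    --namespace oci_computeagent \
    --query-text "$(mql_query "$metric" 1m mean)"
}

# Example (not run automatically):
#   query_compute_metric "$COMPARTMENT_OCID" CpuUtilization
```

If both instances are reporting, the response should contain one data series per instance; a missing series for Priv1 would point to the connectivity problem discussed in this post.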

3. Remove NAT Gateway route

Now let’s remove the route to the NAT Gateway and we will see that Monitoring Services won’t receive Priv1 metrics anymore.

Just go to the PrivRT route table and remove the route rule that points to the NAT Gateway. Then wait a few more minutes and check the Instance and Monitoring pages again:

Screen Shot 2020-06-22 at 11.32.42 AM

Screen Shot 2020-06-22 at 11.30.38 AM

You see? Although the agent is still collecting metrics inside the Priv1 instance, they are not being propagated, because the agent can no longer reach the Monitoring Services endpoint.

As Instance Pools and Autoscaling rely on the Monitoring Services to check metrics and decide when to scale, they would NOT work with the current configuration, which is exactly what our question was asking.
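To confirm from inside Priv1 that the missing piece really is the network path, a quick reachability probe can help. The sketch below assumes the Monitoring ingestion endpoint follows the pattern `telemetry-ingestion.<region>.oraclecloud.com` and uses `us-ashburn-1` purely as an example region; the function names are hypothetical helpers of mine.

```shell
# Build the Monitoring ingestion endpoint URL for a region (assumed URL pattern).
telemetry_endpoint() {
  printf 'https://telemetry-ingestion.%s.oraclecloud.com\n' "$1"
}

# Return 0 if a connection to the endpoint can be opened within 5 seconds.
# Any HTTP status counts as reachable; only network-level failures
# (DNS, timeout, no route) make curl return non-zero here.
can_reach_monitoring() {
  curl --silent --output /dev/null --max-time 5 "$(telemetry_endpoint "$1")"
}

# Example, run on the Priv1 instance (not run automatically):
#   can_reach_monitoring us-ashburn-1 && echo "reachable" || echo "no path"
```

With neither a NAT Gateway route nor a Service Gateway route in PrivRT, I would expect the probe to time out; with either one in place it should report the endpoint as reachable.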

4. Create a Service Gateway

Now let’s test the recommended configuration, in which a private instance propagates its metrics to the Monitoring Services over the private OCI network rather than the Internet: let’s create a Service Gateway.

So, go back to your VCN and create a Service Gateway. When creating it, you have two service options to choose from: one specific to Object Storage, and another covering all the services in the Oracle Services Network (which includes Monitoring). Let’s pick the latter:

Screen Shot 2020-06-22 at 11.41.49 AM

Then, let’s create a route rule that points to the Service Gateway. Go to the PrivRT route table (the one associated with the private subnet, which previously had the NAT Gateway rule) and add the proper route rule:

Screen Shot 2020-06-22 at 11.45.05 AM
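For those who script their environments, the same route rule can be sketched as JSON for the OCI CLI. Everything here is an assumption to adapt: `PRIVRT_OCID` and `SGW_OCID` are placeholders, the label `all-iad-services-in-oracle-services-network` is the Ashburn example (list the labels for your region with `oci network service list`), and `sgw_route_rule` is a helper name I made up. Note that updating a route table this way replaces all of its existing rules, so check `oci network route-table update --help` first.

```shell
# Print a route rule list that sends Oracle Services Network traffic
# to a Service Gateway.
# $1: service CIDR label (region specific), $2: Service Gateway OCID.
sgw_route_rule() {
  printf '[{"destination":"%s","destinationType":"SERVICE_CIDR_BLOCK","networkEntityId":"%s"}]\n' "$1" "$2"
}

# Example (not run automatically; replaces ALL rules in the route table):
#   oci network route-table update --rt-id "$PRIVRT_OCID" \
#     --route-rules "$(sgw_route_rule all-iad-services-in-oracle-services-network "$SGW_OCID")"
```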

Wait a few more minutes, then go back to the Monitoring Services page and confirm that the metrics are back for the Priv1 instance:

Screen Shot 2020-06-22 at 11.55.00 AM

And that’s it. We were able to prove that, in order to have the metrics of a compute instance available to the Monitoring Services (and thus to the other services that rely on it, like Instance Pools / Autoscaling), you need:

  • A compatible image that has the OCI agent installed (or that allows it to be installed manually).
  • A path to the Internet (either via public IP / Internet Gateway or private IP / NAT Gateway), OR a Service Gateway.

I hope you enjoy! See you soon!


OCI Compute Instance metrics / OCI Monitoring services (com vídeo)

Olá a todos,

Nos meus estudos para a certificação OCI Architect Professional, encontrei uma pergunta no Practice Test oficial cuja resposta vale a pena demonstrar, e é isso que vou fazer aqui.

Esta é a questão:

Screen Shot 2020-06-22 at 9.49.07 AM

A pergunta é sobre o que é necessário para que as métricas de uma OCI Compute instance estejam disponíveis para os serviços de monitoração, e consequentemente para os outros serviços que dependem destas métricas (Instance Pool e Autoscaling por exemplo).

Confesso que acertei a pergunta por eliminação, porque nenhuma das outras três respostas faz sentido, mas eu não me lembrava de ter lido sobre a necessidade do Service Gateway para casos assim. Você não precisa obrigatoriamente de um IP público para a instância propagar métricas (A e C), e Autoscaling funciona perfeitamente com múltiplos ADs (B).

Então, deixe-me mostrar porque é realmente necessário ter o Service Gateway para que uma compute instance em uma subnet privada possa propagar suas métricas para o Monitoring services.

De acordo com a documentação (como está hoje, porque o OCI está evoluindo constantemente, então sempre verifique se as coisas ainda funcionam da mesma maneira), para que uma Compute Instance propague suas métricas para o Monitoring Services você precisa:

  1. Uma imagem suportada, que tem o agente de monitoração instalado (você pode instalá-lo manualmente em instâncias com imagens antigas); E
  2. Um IP público OU um Service Gateway.

Como a monitoração é opcional nas compute instances, obviamente você precisa ligar este recurso também.

Bom, a documentação tem a resposta, mas vou demonstrar isto em um exemplo prático.

E, a propósito, vou demonstrar também que a documentação não está 100% correta na sua afirmação sobre isto. Vou mostrar que as métricas também funcionam para uma Compute Instance com apenas IP privado que tenha acesso à Internet, através de um NAT Gateway por exemplo.

Para minha demonstração eu vou criar a seguinte estrutura bem simples:

Screen Shot 2020-06-22 at 11.10.23 AM

Eu poderia criar também uma Instance Pool e Autoscaling para demonstrar o cenário completo criado pela pergunta que estamos analisando, mas não acho que seja necessário. As instâncias na Instance Pool iriam funcionar exatamente da mesma maneira que a minha única instância privada em termos de propagação de métricas.

Se você deseja criar o Autoscaling e ir além no teste, você pode seguir os passos do meu post / apresentação anterior aqui.

Eu não vou perder tempo configurando Security Lists para serem perfeitas em termos de segurança, porque não é o foco de estudo aqui. Vou apenas liberar todo o tráfego de entrada e saída para ambas as subnets. Então, para seguir o processo aqui, tenha certeza de configurar as Route Tables e Security Lists adequadamente.

Todas as minhas instâncias terão a imagem mais recente e terão o monitoramento ligado para atender ao primeiro requisito mencionado acima.

Vamos começar.

1. Crie e configure a VCN, subnets, Route tables, Security Lists, e Compute instances

Estou assumindo que você já tenha conhecimento suficiente (pelo menos pelos meus posts anteriores) para criar e configurar os itens básicos adequadamente. De qualquer maneira, se não é o seu caso, você pode ver meus posts anteriores e/ou seguir o vídeo que estou disponibilizando junto com este post (em Português).

Então, neste momento assumo que você criou todos os itens como mostrado nas imagens abaixo:

VCN e subnets:

Screen Shot 2020-06-22 at 10.39.24 AM

Subnets e suas ligações com Route Tables e Security List:

Screen Shot 2020-06-22 at 10.38.20 AM

Screen Shot 2020-06-22 at 10.38.44 AM

Regras na Security List:

Screen Shot 2020-06-22 at 10.40.41 AM

Screen Shot 2020-06-22 at 10.40.48 AM

Route tables:

Screen Shot 2020-06-22 at 10.41.49 AM

Screen Shot 2020-06-22 at 10.43.10 AM

Compute instances:

Não se esqueça de fazer o upload da sua chave SSH quando criar as instâncias.

Estou escolhendo a imagem sugerida (a Oracle sempre sugere a imagem com Oracle Linux mais recente, então ela terá o agente de monitoração instalado), e a monitoração é ligada por padrão como mostrado aqui:

Screen Shot 2020-06-22 at 10.55.03 AM

Aqui estão as instâncias criadas:

Screen Shot 2020-06-22 at 11.02.51 AM

2. Verifique as métricas e monitoração

Espere alguns minutos depois de criar as instâncias, e verifique se as métricas estão sendo coletadas e aparecem no Monitoring services.

Primeiro, olhe as métricas na página da instância:

Pub1 instance:

Screen Shot 2020-06-22 at 11.13.16 AM

Priv1 instance:

Screen Shot 2020-06-22 at 11.12.24 AM

Agora, vá para Monitoring => Service Metrics e selecione o namespace oci_computeagent. Você deve ver as métricas das duas instâncias no mesmo gráfico:

Screen Shot 2020-06-22 at 11.14.07 AM

Os gráficos acima confirmam que o agente das instâncias está coletando as métricas e consegue enviá-las via Internet para o endpoint do Monitoring Services.

Percebeu? Apesar de que a minha instância Priv1 não tem um IP público e nem um Service Gateway configurado, ela ainda assim consegue enviar as métricas para o Monitoring Services porque há o caminho para a Internet através do NAT Gateway.

3. Remova a rota para o NAT Gateway

Agora vamos remover a rota para o NAT Gateway e veremos que o Monitoring Services não vai mais receber as métricas da instância Priv1.

Vá para a route table PrivRT e remova a rota que aponta para o NAT Gateway. Então aguarde mais alguns minutos e verifique novamente as páginas de métricas:

Screen Shot 2020-06-22 at 11.32.42 AM

Screen Shot 2020-06-22 at 11.30.38 AM

Percebeu? Apesar de o agente ainda estar coletando as métricas dentro da instância  Priv1, elas não são mais propagadas e isso é porque o agente não consegue mais se comunicar com o Monitoring Services.

Como os serviços de Instance Pool e Autoscaling dependem do Monitoring Services para verificar as métricas pra escalar, eles NÃO funcionariam com a configuração atual, que é exatamente o que a nossa questão está perguntando.

4. Crie um Service Gateway

Agora vamos testar a configuração recomendada, para ter a instância privada propagando suas métricas para o Monitoring Services via rede privada da OCI, e não pela Internet: vamos criar um Service Gateway.

Volte para sua VCN e crie um Service Gateway. Ao criar você tem duas opções de endpoint: uma específica para o Object Storage, e outra para todos os outros serviços da OCI. Vamos escolher a segunda:

Screen Shot 2020-06-22 at 11.41.49 AM

Agora, vamos criar uma regra de roteamento para o Service Gateway. Vamos para a nossa route table PrivRT (a que está associada com a subnet privada e que tinha a regra do NAT anteriormente) e adicionar a regra:

Screen Shot 2020-06-22 at 11.45.05 AM

Espere mais alguns minutos, e volte ao Monitoring Services para confirmar que as métricas voltaram a aparecer para a instância Priv1:

Screen Shot 2020-06-22 at 11.55.00 AM

E é isto. Pudemos comprovar que, para ter as métricas de uma compute instance disponíveis para o Monitoring Services (e consequentemente para os outros serviços que dependem dele como o Instance Pools / Autoscaling), você precisa:

  • Uma imagem compatível que tenha o agente do OCI instalado (ou que permita uma instalação manual).
  • Um caminho para a Internet (seja via IP público / Internet Gateway ou IP privado / NAT Gateway), OU um Service Gateway.

Espero que gostem! Nos vemos em breve!

 

Creating an OCI environment from scratch (video in Portuguese)

Hi.

Yesterday I had the honor of presenting a live video on the GUOB (Grupo de Usuários Oracle no Brasil) YouTube channel, showing a step-by-step process for creating an OCI environment from scratch. I started from an empty tenancy and went through all the steps: creating VCNs, Subnets, Compute Instances, a Load Balancer, an Autonomous Database and a File Storage Service, and connecting all the pieces with a simple web application. To finish, I used an Autoscaling configuration to show the automatic scale-out of web servers.

Here is the architecture I created during the presentation:

Screen Shot 2020-06-11 at 10.38.52 AM

The files will be made available probably today on the GUOB website.

I hope you enjoy! See you soon!


Criando um ambiente OCI do Zero (vídeo em Português)

Olá

Ontem eu tive a honra de apresentar uma “Live” no canal do GUOB (Grupo de Usuários Oracle no Brasil) no Youtube mostrando passo a passo como criar um ambiente OCI do zero. Eu comecei com uma tenancy vazia, e fui passando por todos os passos, criando VCN, Subnets, Compute Instances, Load Balancer, um Autonomous Database, um File Storage Service e conectando todas as peças com uma aplicação web simples. Para terminar, usei uma configuração de Autoscaling para mostrar a criação automática de novos web servers.

Aqui está a arquitetura criada durante a apresentação:

Screen Shot 2020-06-11 at 10.38.52 AM

Os arquivos serão disponibilizados provavelmente ainda hoje no website do GUOB.

Espero que gostem! Nos vemos em breve!

 

OCI Transit Routing (with video)

Hi, this is the third post of a series where I explore the network connectivity between OCI and the outside world using IPSec VPN. In the first post, I created two sites using Virtualbox and connected them using just Openswan, so it was not exactly OCI-related. In the second one, I used AWS with Openswan on one side and OCI IPSec VPN on the other, thus connecting the two clouds. Now, what I’m going to do here is use the IPSec VPN tunnels and almost all the configuration created in the previous post to connect not only to one VCN in OCI, but to several, using OCI Transit Routing. This concept is explained in the OCI documentation here.

This tutorial is also available as a video here, in Portuguese only, to help the Oracle community in Brazil.

I hope you enjoy.

Here is the architecture I’m going to build:

Screen Shot 2020-06-03 at 3.04.37 PM

Please note that the AWS side is exactly the same as the one we built in the previous post. On OCI, we have the same VCN_OS already created there, with its public Subnet, but now we will have two additional Spoke VCNs. VCN_OS will act as our Hub VCN, which will carry the traffic between AWS and the Spoke VCNs in both directions.

We will need to create some Local Peering Gateways as well, in order to connect the VCNs.

NOTE: If any command fails due to insufficient privileges, it’s probably because you are not running it as root. In that case, just put “sudo” before the command, or issue “sudo su” to become root and then run it again.

So, let’s start the process (or continue it).

1. Repeat initial steps to create the VPN Tunnel

All AWS steps and initial steps in OCI are the same as in my previous post Site-to-site IPSec VPN between AWS and OCI (with video).

I’m going to repeat here the steps starting at #13 of that post, where we create the CPE and the IPSec Connection. So, I am skipping steps 1-12 and assuming you have completed all those steps correctly. If you have never followed those steps, please do so before continuing.

If your VPN tunnels are still up and running OK, then you can safely skip directly to step 8 of this post. Otherwise, if the tunnels are not working for any reason, just DELETE/TERMINATE the CPE and the IPSec Connection and start here.

Make sure you write down all the IPs and secret keys. In my case I have at this moment:

Openswan Public IP: 18.220.224.1
Openswan Private IP: 10.1.0.41

Server1 Public IP: 3.128.34.39
Server1 Private IP: 10.1.0.199

Screen Shot 2020-06-03 at 10.08.08 AM

Screen Shot 2020-06-03 at 10.08.18 AM

2. Create a Customer-Premises Equipment (CPE) and an IPSec Connection

Create the CPE with the name CPE_OS and associate it to the public IP of the Openswan in AWS (18.220.224.1). Select the vendor Libreswan (equivalent to Openswan) and the indicated version.

Screen Shot 2020-06-03 at 10.12.43 AM

Create the IPSec connection with the name IPSEC_OS, attached to the CPE and DRG previously created. Use a static route with CIDR 10.1.0.0/16 (which is the AWS VPC’s CIDR).

Screen Shot 2020-06-03 at 10.13.54 AM

Right after creating the IPSec connection, you can already see the two redundant tunnels created:

Screen Shot 2020-06-03 at 10.15.48 AM

Write down their IPs and secret keys; we will need them to configure Openswan. To see a secret key, click on the tunnel and then on Show beside Shared Secret:

Tunnel1 Public IP: 129.213.6.34
Tunnel1 Secret key: “2hE2oW8bfQXPZKlC84DUdf071V6KaXUYgsuiXlYk9oSGHYzsoYwQ8gvMJuPhpaTR”

Tunnel2 Public IP: 129.213.7.38
Tunnel2 Secret key: “7CcJXYUiKOayYB7c4p9czUQcLcFbtOFohA0fxvR79p8VnqOy8lrxjUIPKgnACVoz”

After a while, you will see both tunnels as available, but with the IPSec status still Down. This status will change to Up only when both sides are configured properly.

3. Configure IPSEC in Openswan VM

Now that you have all the IPs and resources required to configure the tunnels, you can go back to your Openswan VM in AWS and configure it.

Create or adjust the file /etc/ipsec.d/oci-ipsec.conf with the following content:

conn oracle-tunnel-1
	left=10.1.0.41
	leftid=18.220.224.1 # See preceding note about 1-1 NAT device
	right=129.213.6.34
	authby=secret
	leftsubnet=0.0.0.0/0 
	rightsubnet=0.0.0.0/0
	auto=start
	mark=5/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti1
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s
conn oracle-tunnel-2
	left=10.1.0.41
	leftid=18.220.224.1 # See preceding note about 1-1 NAT device
	right=129.213.7.38
	authby=secret
	leftsubnet=0.0.0.0/0
	rightsubnet=0.0.0.0/0
	auto=start
	mark=6/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti2
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s

Remember: left parameters refer to the local network (the AWS VPC_OS), and right parameters refer to the remote network (the OCI VCN_OS).

Make sure to adjust the following parameters according to your specific IPs:

left: Openswan private IP
leftid: Openswan public IP
right: tunnel1 and tunnel2 IPs (put one on each “conn” block)

ATTENTION: make sure the indentation is not lost in the copy / paste process, otherwise you will have problems.

Screen Shot 2020-06-03 at 10.27.06 AM

4. Configure the secret keys file in Openswan VM

Create or adjust the file /etc/ipsec.d/oci-ipsec.secrets with the following content:

18.220.224.1 129.213.6.34: PSK "2hE2oW8bfQXPZKlC84DUdf071V6KaXUYgsuiXlYk9oSGHYzsoYwQ8gvMJuPhpaTR"
18.220.224.1 129.213.7.38: PSK "7CcJXYUiKOayYB7c4p9czUQcLcFbtOFohA0fxvR79p8VnqOy8lrxjUIPKgnACVoz"

Screen Shot 2020-06-03 at 10.29.29 AM
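One copy / paste pitfall worth guarding against: web pages often turn the plain ASCII double quotes around the PSK into typographic (curly) quotes, which are not the delimiters the secrets file format expects. A minimal sketch of a check follows; `has_smart_quotes` is a hypothetical helper name of mine, not a libreswan tool.

```shell
# Return 0 if the file contains typographic double quotes, non-zero otherwise.
# Two separate fixed patterns keep the check independent of the locale.
has_smart_quotes() {
  grep -q -e '“' -e '”' "$1"
}

# Example (not run automatically):
#   has_smart_quotes /etc/ipsec.d/oci-ipsec.secrets && echo "fix the quotes first"
```

If the check reports curly quotes, retype them as plain double quotes before restarting IPSec in the next step.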

5. Restart IPSec in Openswan VM

sudo systemctl enable ipsec.service 
sudo systemctl restart ipsec.service
sudo systemctl status ipsec.service

Screen Shot 2020-06-03 at 10.34.27 AM

6. Check the tunnel status

If the tunnels are working, you should see two vti interfaces on the Openswan VM:

ifconfig | grep vti

Screen Shot 2020-06-03 at 10.35.01 AM
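On images where `ifconfig` is not installed, the same check can be sketched with `ip`. The helper below is hypothetical (the name `count_vti` is mine) and assumes the interfaces are named vti1 and vti2, as in the configuration file above; it deliberately ignores the kernel’s default `ip_vti0` device.

```shell
# Count tunnel interfaces named vti<N> in `ip -o link show` style input
# read from stdin. The `|| true` keeps the exit status at 0 when grep
# finds no match (grep -c still prints 0 in that case).
count_vti() {
  grep -c ': vti[0-9]' || true
}

# Example (not run automatically):
#   ip -o link show | count_vti    # 2 is expected once both tunnels are configured
```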

And let’s check if the connection between the two sides has been established:

sudo ipsec status | grep established

Note that Openswan shows the two tunnels are already active and with the connections established:

Screen Shot 2020-06-03 at 10.35.30 AM

If you do not get any results from the above command, even if you wait a few minutes, it is likely that something was wrong with the configuration.

After Openswan shows the connection has been established, wait a little longer and the OCI console should also show the tunnels are finally up:

Screen Shot 2020-06-03 at 10.36.46 AM

7. Add the proper routes in Openswan VM

Please note that we need to add the routes for all the networks we want to access in the target site:

ip route add 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2
ip route add 10.2.0.0/16 nexthop dev vti1 nexthop dev vti2
ip route add 10.3.0.0/16 nexthop dev vti1 nexthop dev vti2

Remember that these routes won’t be persistent. You would need to re-add them every time your instance boots; alternatively, you can make them persistent by adding the following lines to /etc/sysconfig/network-scripts/route-eth0:

10.0.0.0/16 nexthop dev vti1 nexthop dev vti2
10.2.0.0/16 nexthop dev vti1 nexthop dev vti2
10.3.0.0/16 nexthop dev vti1 nexthop dev vti2
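If the list of remote networks grows, typing these lines by hand gets error-prone. Here is a small sketch that generates them from a list of CIDRs; `route_lines` is a hypothetical helper of mine, and it assumes the two tunnel interfaces are vti1 and vti2 as configured earlier.

```shell
# Print one persistent-route line per CIDR, spreading traffic over both
# tunnel interfaces (same format as the lines shown above).
route_lines() {
  local cidr
  for cidr in "$@"; do
    printf '%s nexthop dev vti1 nexthop dev vti2\n' "$cidr"
  done
}

# Example (not run automatically):
#   route_lines 10.0.0.0/16 10.2.0.0/16 10.3.0.0/16 \
#     | sudo tee -a /etc/sysconfig/network-scripts/route-eth0
#   ip route get 10.2.0.2    # shows which interface the kernel would pick
```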

8. Create a Bastion instance in OCI

In our new architecture, the VCN_OS network (which will be our Hub VCN) will have only the Bastion host. We will be able to access it via SSH and from there reach all the other servers in OCI.

If you still have the server2 created in our previous architecture, just terminate it.

Let’s create the Bastion instance in the public subnet of our VCN_OS. Don’t forget to upload your public key so you can connect via SSH later. Remember that this Subnet is using the Route Table PubRT and Security List PubSL created in our previous post.

Screen Shot 2020-06-03 at 10.20.50 AM

When the instance is provisioned, write down its IPs as well:

Bastion Public IP: 193.122.166.22
Bastion Private IP: 10.0.0.3

9. Access Bastion via PING and SSH

ping 193.122.166.22
ssh -i ~/.ssh/ocikey opc@193.122.166.22

Screen Shot 2020-06-03 at 10.22.48 AM

10. Test ping between Server1 and Bastion

If your tunnels are up and you added the proper route in Openswan VM, the ping between Server1 (in AWS) and Bastion (in OCI) should work through their private IPs:

ssh -i ~/.ssh/awskey.pem ec2-user@3.128.34.39
ping 10.0.0.3

Screen Shot 2020-06-03 at 10.54.21 AM

ssh -i ~/.ssh/ocikey opc@193.122.166.22
ping 10.1.0.199

Screen Shot 2020-06-03 at 10.55.25 AM

11. Test SSH between Server1 and Bastion

Before testing SSH, we need to copy the private keys into the servers.

Let’s copy the OCI private key into Server1 on AWS, and the AWS private key into Bastion on OCI.

scp -i ~/.ssh/awskey.pem ~/.ssh/ocikey ec2-user@3.128.34.39:~/.ssh
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@193.122.166.22:~/.ssh

Screen Shot 2020-06-03 at 11.00.49 AM

Now let’s try to connect to Server1, and from there do SSH to Bastion (using its private IP):

ssh -i ~/.ssh/awskey.pem ec2-user@3.128.34.39
# then, from inside Server1:
ssh -i ~/.ssh/ocikey opc@10.0.0.3

Screen Shot 2020-06-03 at 11.02.04 AM

And let’s do the opposite test as well, connecting to Bastion first and then doing SSH to Server1 (using its private IP):

ssh -i ~/.ssh/ocikey opc@193.122.166.22
# then, from inside Bastion:
ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 11.04.48 AM

12. Create the two Spoke VCNs

Now, let’s start creating new stuff. As you can see in our architecture design, we will have two more VCNs in OCI, the spoke VCNs, which will take advantage of the IPSec VPN to communicate with the “on-premise” (actually AWS) site.

For these two VCNs we will leverage the default Route Table and default Security List of each one, in order to simplify the process.

Create the VCNs with the following parameters:

VCN Spoke1, CIDR 10.2.0.0/16, and a private subnet PrivSpoke1 with CIDR 10.2.0.0/24.

VCN Spoke2, CIDR 10.3.0.0/16, and a private subnet PrivSpoke2 with CIDR 10.3.0.0/24.

Screen Shot 2020-06-03 at 11.15.07 AM

Screen Shot 2020-06-03 at 11.15.37 AM

13. Create the servers Serversp1 and Serversp2

Let’s now create two VM instances in OCI, named Serversp1 and Serversp2, inside Subnets PrivSpoke1 and PrivSpoke2 respectively. Please note these two instances will have only a private IP, so the only way to connect to them via SSH is to access the Bastion host first, and from there jump to Serversp1 or Serversp2. Don’t forget to upload your OCI key when creating the instances.

Screen Shot 2020-06-03 at 11.22.44 AM

Screen Shot 2020-06-03 at 11.22.58 AM

Let’s write down their IPs:

Serversp1 Private IP:

Serversp2 Private IP:

14. Peer the VCNs

In order for the VCNs to communicate, we need to connect them using Local Peering Gateways (LPGs), because they are in the same region (if they were in different regions, we would need to use DRGs to establish a Remote Peering Connection).

So, we will need to create two LPGs within VCN_OS, and one within each of the Spoke1 and Spoke2 VCNs. I will name them LPG_OS_Spoke1, LPG_OS_Spoke2, LPG_Spoke1_OS and LPG_Spoke2_OS respectively.

After creating them, establish the Peering Connections properly, according to our design:

VCN_OS / LPG_OS_Spoke1 <===> Spoke1 / LPG_Spoke1_OS

VCN_OS / LPG_OS_Spoke2 <===> Spoke2 / LPG_Spoke2_OS

Please note that you can establish the peering from either side; the result will be the same.

Screen Shot 2020-06-03 at 11.33.51 AM

15. Create all the required routing rules

Well, the next two steps are maybe the trickiest ones. We need to create a lot of routing rules and route tables, not only in OCI, but adjust some things in AWS as well.

Instead of describing every rule we have to create, which would be very boring, I’m gonna put here a slide with each and every Route Table we will have, and the rules required in each one:

Screen Shot 2020-06-03 at 5.42.32 PM

15.1. Adjust the routing rules in AWS

Let’s start with the AWS part, in purple in the slide. You already have the PubRT_OS route table, so include the route rules for the two new VCNs (10.2.0.0/16 and 10.3.0.0/16).

Screen Shot 2020-06-03 at 3.25.27 PM

15.2. Create a Route Table for the DRG

Now, back to OCI, let’s create the Routing Table for the DRG named DRG_RT and with the following rules:

Screen Shot 2020-06-03 at 3.32.53 PM

15.3. Adjust the routing rules in the VCN_OS public subnet

Let’s proceed to the Route Table PubRT attached to the Public Subnet of VCN_OS. It already exists, but make sure to add the rules that will route traffic to the LPGs:

Screen Shot 2020-06-03 at 3.40.41 PM

15.4. Create the Route Table for the LPG_OS_Spoke1 in the VCN_OS

Now, still in the VCN_OS, create the Route Table RT_LPG_OS_Spoke1 with the following rules:

Screen Shot 2020-06-03 at 4.07.33 PM

15.5. Create the Route Table for the LPG_OS_Spoke2 in the VCN_OS

Do the same again to create the Route Table RT_LPG_OS_Spoke2 with the following rules:

Screen Shot 2020-06-03 at 4.10.34 PM

15.6. Attach the new Route Tables to the DRG and LPGs

First attach DRG_RT to your DRG: from within your VCN_OS, click Dynamic Routing Gateway, then click on the three dots beside the DRG and choose Associate Route Table:

Screen Shot 2020-06-03 at 3.36.29 PM

Then, do the same for both LPGs:

Screen Shot 2020-06-03 at 4.13.59 PM

15.7. Adjust the Spoke1 VCN Default Route Table

As you remember, I opted to use the default route table and security list for both spoke VCNs. So, adjust the default route table of the Spoke1 VCN to have the following rules:

Screen Shot 2020-06-03 at 5.48.03 PM

15.8. Adjust the Spoke2 VCN Default Route Table

Similarly, adjust the default route table of the Spoke2 VCN to have the following rules:

Screen Shot 2020-06-03 at 5.48.36 PM

16. Create all the required firewall (Security Lists / Groups) rules

I also prepared a slide to show all the Security Lists that need to be created or adjusted:

Screen Shot 2020-06-03 at 5.44.57 PM

Please note that I am showing only ingress rules. For the egress, we will just allow all traffic.

16.1. Adjust Security Group rules in AWS

In AWS we need to include ingress rules for the two new VCNs we’ve created in OCI. Then the PublicSG Security Group will have the following rules:

Screen Shot 2020-06-03 at 4.25.58 PM

16.2. Adjust Security List PubSL in VCN_OS

Let’s now adjust the Security List being used by our public Subnet in VCN_OS, by adding the ingress rules for the two new VCNs:

Screen Shot 2020-06-03 at 4.32.50 PM

16.3. Adjust the Spoke1’s Default Security List

In Spoke1 I’ll just allow all the incoming traffic from the other spoke VCN and the AWS VPC as well:

Screen Shot 2020-06-03 at 5.54.13 PM

16.4. Adjust the Spoke2’s Default Security List

Let’s do the same in Spoke2:

Screen Shot 2020-06-03 at 8.26.31 PM

17. Test the connectivity among the networks

17.1 Test connectivity between the Hub VCN (VCN_OS) and the spoke VCNs

Let’s see if the LPGs are working properly, trying to communicate between the Bastion host in VCN_OS and the two spoke VCNs:

scp -i ~/.ssh/ocikey ~/.ssh/ocikey opc@193.122.166.22:~/.ssh
ssh -i ~/.ssh/ocikey opc@193.122.166.22
ping 10.2.0.2
ping 10.3.0.2
ssh -i ~/.ssh/ocikey opc@10.2.0.2
exit
ssh -i ~/.ssh/ocikey opc@10.3.0.2

Screen Shot 2020-06-03 at 6.32.30 PM
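When there are several hosts to verify, the manual pings above can be wrapped in a small loop. This is only a sketch: probe_all and ping_probe are hypothetical helper names, and the ping options assume the Linux (iputils) ping available on the Bastion image.

```shell
# Run a probe command against each host and report the result.
# Usage: probe_all PROBE_COMMAND HOST...
probe_all() {
  probe=$1
  shift
  failures=0
  for host in "$@"; do
    if "$probe" "$host" >/dev/null 2>&1; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failures=$((failures + 1))
    fi
  done
  # Non-zero exit status when at least one host did not answer.
  [ "$failures" -eq 0 ]
}

# Probe based on ping: three echo requests, 2-second reply timeout (Linux ping).
ping_probe() {
  ping -c 3 -W 2 "$1"
}

# Example (run on the Bastion host):
#   probe_all ping_probe 10.2.0.2 10.3.0.2
```

Passing the probe as a parameter keeps the loop reusable: the same function can be given a different check later, such as an SSH-based one.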

17.2. Test connectivity between the Hub VCN and AWS

scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@193.122.166.22:~/.ssh
ssh -i ~/.ssh/ocikey opc@193.122.166.22 
ping 10.1.0.41
ping 10.1.0.199
ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 6.47.09 PM

17.3. Test connectivity between a spoke VCN and AWS

ssh -i ~/.ssh/ocikey opc@193.122.166.22 
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@10.2.0.2:~/.ssh
ssh -i ~/.ssh/ocikey opc@10.2.0.2 
ping 10.1.0.199
ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 6.56.33 PM

To finish, it’s important to say that the Spoke VCNs cannot communicate with each other. If you wanted to allow that, you would need to create a new LPG in each of Spoke1 and Spoke2, peer them with each other, and add the proper rules to their Route Tables and Security Lists.

You may imagine, like I did, that simply creating a route rule in the route tables attached to the LPGs in the Hub VCN, pointing to the other LPG, would do the job. It would be something like this:

Screen Shot 2020-06-03 at 7.03.33 PM

But no, it doesn’t work. Oracle does not allow us to create an LPG-type rule in a Route Table attached to another LPG. See what happens if I try (note the red error message at the bottom-right of the screen):

Screen Shot 2020-06-03 at 6.27.30 PM

I don’t see a reason for it not to be allowed, but it is what it is, right?

That’s it.

I hope you have enjoyed this long journey again. If you did, please leave a comment; it’s important. Let me know if you have any questions as well.

Thanks and see you later!


OCI Transit Routing (with video)

Português/English

Hello, this is the third post in a series where I explore the connectivity capabilities between OCI and the outside world using IPSec VPN. In the first post, I created two sites using Virtualbox and connected them using Openswan, so it is not directly related to OCI. In the second, I used AWS with Openswan on one side and the IPSec VPN service on the other, thus connecting the two clouds. Now, what I’m going to do here is use the IPSec tunnels and almost all of the configuration created in the previous post to connect not just to one VCN in OCI, but to several, using OCI Transit Routing. This concept is explained in the OCI documentation here.

This tutorial is also available on video here, in Portuguese only, to help the Oracle community in Brazil.

I hope you enjoy it.

Here is the architecture I’m going to build:

Screen Shot 2020-06-03 at 3.04.37 PM

Note that on the AWS side the configuration is exactly the same as the one built in the previous post. In OCI, we have the same VCN_OS already created, with its public subnet, but now we will also have two “spoke” VCNs. VCN_OS will play the role of Hub VCN, directing the traffic between AWS and the Spoke VCNs in both directions.

We will also have to create some Local Peering Gateways, to allow the connection between the VCNs.

NOTE: if any command fails because of privileges, you are probably not running it as root. To make it work, just put “sudo” before the command, or run “sudo su” to become root, and then you will be able to run any command.

So let’s start (or continue).

1. Repeat the initial steps to create the VPN tunnel

All the AWS steps and the initial OCI steps are the same as in my post Site-to-site IPSec VPN between AWS and OCI (with video).

I will repeat here the steps from #13 onward, where we created the CPE and the IPSec connection. So I am skipping steps 1-12 and assuming you have executed all of them correctly before continuing. If you have never followed the steps in that post, do that before continuing here.

If your VPN tunnels are up and working OK, you can safely skip to step 8 of this post. On the other hand, if the tunnels are not working for any reason, destroy the CPE and the IPSec connection, and start from here.

Write down the IPs and secret keys of the elements of your architecture. In my case I have so far:

Openswan public IP: 18.220.224.1
Openswan private IP: 10.1.0.41

Server1 public IP: 3.128.34.39
Server1 private IP: 10.1.0.199

Screen Shot 2020-06-03 at 10.08.08 AM

Screen Shot 2020-06-03 at 10.08.18 AM

2. Create a Customer Premise Equipment (CPE) and an IPSec connection

Create the CPE with the name CPE_OS and associate it with the public IP of the Openswan VM in AWS (18.220.224.1). Select the vendor Libreswan (equivalent to Openswan) and the indicated version.

Screen Shot 2020-06-03 at 10.12.43 AM

Create the IPSec connection with the name IPSEC_OS, attached to the CPE and DRG created previously. Use a static route with CIDR 10.1.0.0/16 (which is the AWS VPC’s CIDR).

Screen Shot 2020-06-03 at 10.13.54 AM

After creating the IPSec connection, you can already see the two redundant tunnels created:

Screen Shot 2020-06-03 at 10.15.48 AM

Write down the IPs and secret keys (to see a secret key, click on the tunnel and then on Show beside Shared Secret); we will need them to configure Openswan:

Tunnel1 public IP: 129.213.6.34
Tunnel1 secret key: "2hE2oW8bfQXPZKlC84DUdf071V6KaXUYgsuiXlYk9oSGHYzsoYwQ8gvMJuPhpaTR"

Tunnel2 public IP: 129.213.7.38
Tunnel2 secret key: "7CcJXYUiKOayYB7c4p9czUQcLcFbtOFohA0fxvR79p8VnqOy8lrxjUIPKgnACVoz"

After a few minutes, you will see the tunnels as available, but with the IPSec status still Down. This status will change to Up only when both sides are properly configured.

3. Configure IPSEC in the Openswan VM

Now that you have all the IPs and resources needed to configure the tunnels, you can go back to the Openswan VM in AWS and configure it.

Create or adjust the file /etc/ipsec.d/oci-ipsec.conf with the following content:

conn oracle-tunnel-1
	left=10.1.0.41
	leftid=18.220.224.1 # See preceding note about 1-1 NAT device
	right=129.213.6.34
	authby=secret
	leftsubnet=0.0.0.0/0 
	rightsubnet=0.0.0.0/0
	auto=start
	mark=5/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti1
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s
conn oracle-tunnel-2
	left=10.1.0.41
	leftid=18.220.224.1 # See preceding note about 1-1 NAT device
	right=129.213.7.38
	authby=secret
	leftsubnet=0.0.0.0/0
	rightsubnet=0.0.0.0/0
	auto=start
	mark=6/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti2
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s

Remember: the left parameters refer to the local site (VPC_OS in AWS), and the right parameters refer to the remote site (VCN_OS in OCI).

Adjust the following parameters according to your IPs:

left: Openswan private IP
leftid: Openswan public IP
right: IPs of tunnel1 and tunnel2 (put one in each “conn” block)

ATTENTION: make sure the indentation is not lost in the copy / paste, otherwise you will have problems.

Screen Shot 2020-06-03 at 10.27.06 AM

4. Configure the secret keys in the Openswan VM

Create or adjust the file /etc/ipsec.d/oci-ipsec.secrets with the following content:

18.220.224.1 129.213.6.34: PSK "2hE2oW8bfQXPZKlC84DUdf071V6KaXUYgsuiXlYk9oSGHYzsoYwQ8gvMJuPhpaTR"
18.220.224.1 129.213.7.38: PSK "7CcJXYUiKOayYB7c4p9czUQcLcFbtOFohA0fxvR79p8VnqOy8lrxjUIPKgnACVoz"

Screen Shot 2020-06-03 at 10.29.29 AM

5. Restart IPSec in the Openswan VM

sudo systemctl enable ipsec.service 
sudo systemctl restart ipsec.service
sudo systemctl status ipsec.service

Screen Shot 2020-06-03 at 10.34.27 AM

6. Check the tunnel status

If the tunnel is working, you should see two vti interfaces on the Openswan VM:

ifconfig | grep vti

Screen Shot 2020-06-03 at 10.35.01 AM

And let’s check whether the connection between the two sides has been established:

sudo ipsec status | grep established

Note that Openswan shows that both tunnels are already active and with the connection established:

Screen Shot 2020-06-03 at 10.35.30 AM

If you get no result from the command above, even after waiting a few minutes, you probably made a mistake somewhere in the configuration.

After Openswan shows the connections established, wait a little longer and the OCI console should also show the tunnels finally Up:

Screen Shot 2020-06-03 at 10.36.46 AM

7. Add the required routes in the Openswan VM

Note that we are going to add a route for every network we want to reach at the remote site (OCI):

ip route add 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2
ip route add 10.2.0.0/16 nexthop dev vti1 nexthop dev vti2
ip route add 10.3.0.0/16 nexthop dev vti1 nexthop dev vti2

Remember that these routes are not persistent. Every time you restart the VM you will need to add the routes again. Alternatively, you can make them persistent by adding the following lines to the file /etc/sysconfig/network-scripts/route-eth0:

10.0.0.0/16 nexthop dev vti1 nexthop dev vti2
10.2.0.0/16 nexthop dev vti1 nexthop dev vti2
10.3.0.0/16 nexthop dev vti1 nexthop dev vti2
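Since every remote network gets the same multipath entry, the lines above can also be generated from a list of CIDRs. A minimal sketch, where route_lines is a hypothetical helper name and vti1 and vti2 are the interface names from the tunnel configuration:

```shell
# Print one multipath route entry per remote CIDR, in the format used above.
route_lines() {
  for cidr in "$@"; do
    printf '%s nexthop dev vti1 nexthop dev vti2\n' "$cidr"
  done
}

# Example: preview the entries for the Hub VCN and both spoke VCNs.
#   route_lines 10.0.0.0/16 10.2.0.0/16 10.3.0.0/16
# To persist them, redirect the output (as root):
#   route_lines 10.0.0.0/16 10.2.0.0/16 10.3.0.0/16 >> /etc/sysconfig/network-scripts/route-eth0
```

Adding a new spoke later then only means adding one more CIDR to the list.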

8. Create a Bastion host in OCI

In this new architecture, VCN_OS (which will be our Hub VCN) will have only this Bastion host. We will access it through SSH, and from it we will be able to reach all the other servers in OCI.

If you still have the Server2 created in the previous architecture, destroy it.

Let’s create the Bastion VM in the public subnet of your VCN_OS. Don’t forget to upload your public key so you can connect via SSH later. Remember that this subnet is using the Route Table PubRT and the Security List PubSL created earlier.

Screen Shot 2020-06-03 at 10.20.50 AM

When the VM is provisioned, write down its IPs:

Bastion public IP: 193.122.166.22
Bastion private IP: 10.0.0.3

9. Access the Bastion via PING and SSH

ping 193.122.166.22
ssh -i ~/.ssh/ocikey opc@193.122.166.22

Screen Shot 2020-06-03 at 10.22.48 AM

10. Test ping between Server1 and the Bastion

If your tunnel is up and you added the routes properly in the Openswan VM, ping between Server1 (in AWS) and the Bastion (in OCI) should work using the private IPs:

ssh -i ~/.ssh/awskey.pem ec2-user@3.128.34.39
ping 10.0.0.3

Screen Shot 2020-06-03 at 10.54.21 AM

ssh -i ~/.ssh/ocikey opc@193.122.166.22
ping 10.1.0.199

Screen Shot 2020-06-03 at 10.55.25 AM

11. Test SSH between Server1 and the Bastion

Before testing SSH, we need to copy the private keys to the servers.

Let’s copy the OCI private key into Server1 in AWS, and the AWS private key into the Bastion in OCI.

scp -i ~/.ssh/awskey.pem ~/.ssh/ocikey ec2-user@3.128.34.39:~/.ssh
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@193.122.166.22:~/.ssh

Screen Shot 2020-06-03 at 11.00.49 AM

Now let’s connect to Server1, and from there SSH to the Bastion (using its private IP):

ssh -i ~/.ssh/awskey.pem ec2-user@3.128.34.39
# ssh -i ~/.ssh/ocikey opc@10.0.0.3

Screen Shot 2020-06-03 at 11.02.04 AM

And let’s go the other way too, connecting to the Bastion first and from there doing SSH to Server1 (using its private IP):

ssh -i ~/.ssh/ocikey opc@193.122.166.22
# ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 11.04.48 AM

12. Create the two Spoke VCNs

Now we start creating new elements. As you can see in the architecture diagram, we will have two new VCNs in OCI, the spoke VCNs, which will use the IPSec tunnel to communicate with the “on-premise” environment (actually AWS).

For these two VCNs we will use each network’s default Route Table and default Security List, to simplify the process.

Create the VCNs as follows:

VCN Spoke1, CIDR 10.2.0.0/16, and a private subnet PrivSpoke1 with CIDR 10.2.0.0/24.

VCN Spoke2, CIDR 10.3.0.0/16, and a private subnet PrivSpoke2 with CIDR 10.3.0.0/24.

Screen Shot 2020-06-03 at 11.15.07 AM

Screen Shot 2020-06-03 at 11.15.37 AM

13. Create the servers Serversp1 and Serversp2

Let’s create two VMs named Serversp1 and Serversp2, in the subnets PrivSpoke1 and PrivSpoke2 respectively. Note that these instances will have only private IPs, so the only way to reach them via SSH is through the Bastion host. Don’t forget to upload your public key when creating the VMs.

Screen Shot 2020-06-03 at 11.22.44 AM

Screen Shot 2020-06-03 at 11.22.58 AM

Let’s write down the IPs:

Serversp1 private IP:

Serversp2 private IP:

14. Connect the VCNs

For the VCNs to communicate, we need to connect them using Local Peering Gateways (LPG), because they are in the same region (if they were in different regions, we would need to use a DRG to establish a Remote Peering Connection).

We will need two LPGs in VCN_OS, and one more in each of the VCNs Spoke1 and Spoke2. I will name them LPG_OS_Spoke1, LPG_OS_Spoke2, LPG_Spoke1_OS and LPG_Spoke2_OS respectively.

After creating them, establish the connections according to our diagram:

VCN_OS / LPG_OS_Spoke1 <===> Spoke1 / LPG_Spoke1_OS

VCN_OS / LPG_OS_Spoke2 <===> Spoke2 / LPG_Spoke2_OS

Note that you can establish the connection from either side; the result will be the same.

Screen Shot 2020-06-03 at 11.33.51 AM

15. Create all the routing rules

Well, the next two steps are perhaps the most tiresome. We will need to create a bunch of routing rules, not only in OCI, but also adjust some in AWS.

Instead of describing each rule, which would be very tedious, I will show a slide with all the Route Tables and the rules in each one:

Screen Shot 2020-06-03 at 5.42.32 PM

15.1. Adjust the routing rules in AWS

Let’s start with the AWS part, in purple on the slide. You already have the route table PubRT_OS there, so include the rules for the two new VCNs (10.2 and 10.3).

Screen Shot 2020-06-03 at 3.25.27 PM

15.2. Create a Route Table for the DRG

Now, back in OCI, let’s create a Route Table for the DRG with the name DRG_RT and the following rules:

Screen Shot 2020-06-03 at 3.32.53 PM

15.3. Adjust the routing rules in the public subnet of VCN_OS

Let’s continue with the Route Table PubRT attached to the public subnet of VCN_OS. It already exists, but make sure to add the rules that direct traffic to the LPGs:

Screen Shot 2020-06-03 at 3.40.41 PM

15.4. Create the Route Table for LPG_OS_Spoke1 in VCN_OS

Still in VCN_OS, create the Route Table RT_LPG_OS_Spoke1 with the following rules:

Screen Shot 2020-06-03 at 4.07.33 PM

15.5. Create the Route Table for LPG_OS_Spoke2 in VCN_OS

Create the Route Table RT_LPG_OS_Spoke2 with the following rules:

Screen Shot 2020-06-03 at 4.10.34 PM

15.6. Associate the new Route Tables with the DRG and the LPGs

First associate DRG_RT with your DRG: in VCN_OS, click on Dynamic Routing Gateway, then on the three dots beside the DRG, and choose Associate Route Table:

Screen Shot 2020-06-03 at 3.36.29 PM

Do the same for the two LPGs:

Screen Shot 2020-06-03 at 4.13.59 PM

15.7. Adjust the Spoke1 VCN Default Route Table

As I mentioned before, I decided to use the default route table and the default security list for the spoke VCNs. So, adjust the default route table of the Spoke1 VCN to have the following rules:

Screen Shot 2020-06-03 at 5.48.03 PM

15.8. Adjust the Spoke2 VCN Default Route Table

Similarly, adjust the default route table of the Spoke2 VCN to have the following rules:

Screen Shot 2020-06-03 at 5.48.36 PM

16. Create all the firewall rules (Security Lists / Groups)

I also prepared a slide showing all the Security Lists and Security Groups that will be created or adjusted:

Screen Shot 2020-06-03 at 5.44.57 PM

Note that I am showing only the ingress rules. For egress, we will allow all traffic in every list.

16.1. Adjust the Security Group in AWS

In AWS we need to include the ingress rules for the two new VCNs we created in OCI. PublicSG should have the following rules:

Screen Shot 2020-06-03 at 4.25.58 PM

16.2. Adjust the Security List PubSL in VCN_OS

Let’s now adjust the Security List used by the public subnet in VCN_OS, adding the ingress rules for the two new VCNs:

Screen Shot 2020-06-03 at 4.32.50 PM

16.3. Adjust the Default Security List of the Spoke1 VCN

In the Spoke1 VCN I will allow all traffic coming from the other spoke VCN and from the AWS VPC as well:

Screen Shot 2020-06-03 at 5.54.13 PM

16.4. Adjust the Default Security List of the Spoke2 VCN

In the Spoke2 VCN let’s do the same thing:

Screen Shot 2020-06-03 at 8.26.31 PM

17. Test the connectivity among the networks

17.1. Test connectivity between the Hub VCN (VCN_OS) and the spoke VCNs

Let’s see if the LPGs are working, by trying to communicate between the Bastion host in VCN_OS and the two spoke VCNs:

scp -i ~/.ssh/ocikey ~/.ssh/ocikey opc@193.122.166.22:~/.ssh
ssh -i ~/.ssh/ocikey opc@193.122.166.22
ping 10.2.0.2
ping 10.3.0.2
ssh -i ~/.ssh/ocikey opc@10.2.0.2
exit
ssh -i ~/.ssh/ocikey opc@10.3.0.2

Screen Shot 2020-06-03 at 6.32.30 PM

17.2. Test connectivity between the Hub VCN and AWS

scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@193.122.166.22:~/.ssh
ssh -i ~/.ssh/ocikey opc@193.122.166.22 
ping 10.1.0.41
ping 10.1.0.199
ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 6.47.09 PM

17.3. Test connectivity between a spoke VCN and AWS

ssh -i ~/.ssh/ocikey opc@193.122.166.22 
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@10.2.0.2:~/.ssh
ssh -i ~/.ssh/ocikey opc@10.2.0.2 
ping 10.1.0.199
ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.199

Screen Shot 2020-06-03 at 6.56.33 PM

To finish, it is important to say that the Spoke VCNs cannot communicate with each other. If you want to allow that, you will have to create new LPGs in the spoke VCNs, connect them, and create the routing and firewall rules.

You might imagine, as I did, that it would be enough to create route rules between the two LPGs of VCN_OS (the Hub), pointing one to the other. It would be something like this:

Screen Shot 2020-06-03 at 7.03.33 PM

But no, that doesn’t work. OCI does not allow us to create an LPG-type rule inside a Route Table that is associated with another LPG. See what happens if I try (note the error message in red at the bottom-right corner):

Screen Shot 2020-06-03 at 6.27.30 PM

I don’t understand why this is not allowed, but that’s how it is.

Well, that’s it!

I hope you enjoyed this huge post too. It was a lot of work, but it’s always worth it.

If you liked it, please leave a comment. It’s important!

Let me know if you have any questions as well.

Thanks and see you next time.

Site-to-site IPSec VPN between AWS and OCI (with video)

 

Português/English

Hi, as I mentioned in my previous post, after configuring a site-to-site VPN between two Openswan VMs in different networks using Virtualbox, now I’m gonna do the same but using two clouds: AWS and OCI.

As my target is to study OCI, I will use AWS to emulate an “on premise” site, with an Openswan server at the end of the tunnel. On the other side, in OCI, I will use the IPSec VPN service that is my real learning target in this process.

This tutorial is also available on video here, in Portuguese only, to help the Oracle community in Brazil.

I hope you enjoy it.

Here is the architecture I’m going to build:

Screen Shot 2020-06-01 at 3.03.24 PM

On the left side we have AWS, emulating the on premise data center, and on the right side we have the OCI cloud. They will be connected through the VPN.

Here is a description of the elements of my architecture:

AWS VPC_OS – the AWS network where the Openswan and Server1 servers will reside, in a public subnet.

OCI VCN_OS – the OCI network where Server2 will reside. Remember that it’s not necessary to create an Openswan server in OCI because I’m going to use the IPSec VPN service.

On both networks I will create an Internet Gateway to allow Internet access from any VM. It’s important to say that I’m creating only a public subnet on each side to simplify the SSH access, but the best approach in terms of security would be to have Server1 and Server2 under private subnets.

In OCI, I will need to create a CPE (Customer Premise Equipment), to represent the device that will receive the tunnel on the other side (in our case, the Openswan server). I will also create a DRG (Dynamic Routing Gateway) and configure the IPSec service.

I used different colors in the connections to show the connectivity paths available.

Note: besides the several websites and blogs I consulted to adjust and debug my process, the main document I followed to create it is at this link, which I recommend reading.

Ok, let’s start the process.

1. Create a new AWS VPC

Note: I am assuming you have a Free Tier account on AWS, and the basic knowledge needed to create and configure VPCs, Subnets and the other basic resources I’m gonna use. That’s why I won’t include many screenshots of the basic steps here. If you don’t have this basic knowledge, it’s pretty easy to find tutorials on the Internet on how to create the basic elements I’m using here 😉 .

We will create a new VPC on AWS with the name VPC_OS using CIDR 10.1.0.0/16, and a public subnet associated with it using 10.1.0.0/24. Configure the subnet to automatically associate public IPs.

Also create an Internet Gateway and associate it with your VPC.

2. Create a Security Group

We will now create a Security Group with the name PublicSG in our VPC, with the following rules:

Screen Shot 2020-06-01 at 3.48.44 PM

Screen Shot 2020-06-01 at 3.48.51 PM

Note that I am allowing all outgoing traffic. The rules on UDP ports 500 and 4500 (IKE and NAT traversal, respectively) are required for Openswan’s IPSec VPN service to work. SSH is for allowing connections and ICMP is for allowing pings.

3. Create an instance for Openswan

For the AWS instances we are going to create, we can accept the default image and type, which are sufficient and allow the use of the free service from Amazon (Free Tier).

In the instance configuration details, ensure that you are selecting the VPC and Subnet correctly, and turning on the public IP association.

Screen Shot 2020-06-01 at 4.28.22 PM

Also make sure you select the correct Security Group, which we created earlier:

Screen Shot 2020-06-01 at 4.30.00 PM

After a few minutes your instance will be created. Change its name to Openswan. Write down the IPs associated with it:

Openswan public IP: 18.222.77.165
Openswan private IP: 10.1.0.227

Screen Shot 2020-06-01 at 4.33.01 PM

After creating the instance, select it and click the Connect button to associate a key for SSH connection. If you don’t have a key, you can have one generated and then reuse it for the other instances as well.

4. Create another instance for Server1

Select the same options used to create the previous one. Write down the IPs of this one as well:

Server1 public IP: 3.12.166.223

Server1 private IP: 10.1.0.206

Screen Shot 2020-06-01 at 4.36.51 PM

5. Create a Route Table

Let’s create a Route Table with the name PubRT_OS, associated with our VPC’s public subnet, and add the following rules:

Destination: 0.0.0.0/0, Target: Internet Gateway.

Destination: 10.0.0.0/16, Target: the Openswan instance.

Screen Shot 2020-06-01 at 3.55.16 PM

Note that I am already including a rule that sends traffic destined for the remote network to be created later in OCI (CIDR 10.0.0.0/16) to the Openswan VM, where we will build the tunnel.

6. Connect to both instances and test ping between them

Openswan:

ping 18.222.77.165
ssh -i ~/.ssh/awskey.pem ec2-user@18.222.77.165

Screen Shot 2020-06-01 at 4.48.55 PM

Server1:

ping 3.12.166.223
ssh -i ~/.ssh/awskey.pem ec2-user@3.12.166.223

Screen Shot 2020-06-01 at 4.50.59 PM

7. Install Openswan on Openswan VM

sudo yum install openswan lsof

8. Configure Openswan VM

First of all, go to the AWS console, select the Openswan instance and then select Actions -> Network -> Change Source/Dest Checking to turn off this check.

Then, back in the terminal, adjust the following kernel parameters to enable IP forwarding and disable redirects:

Edit /etc/sysctl.conf and change/include:

net.ipv4.ip_forward=1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.eth0.accept_redirects = 0

To apply the changes, execute:

sudo sysctl -p
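To double-check that the file really contains the expected settings, something like the following could be used. It is a sketch under assumptions: missing_sysctl_settings is a name invented here, and it only inspects the file. The live value of a single key can be read with sysctl followed by the key name, for example sysctl net.ipv4.ip_forward.

```shell
# Report which of the expected "key=value" settings are absent from a
# sysctl.conf-style file. Blanks around "=" are ignored.
missing_sysctl_settings() {
  file=$1
  shift
  status=0
  for setting in "$@"; do
    # Strip spaces and tabs from every line, then look for an exact line match.
    if ! tr -d ' \t' < "$file" | grep -qxF -- "$setting"; then
      echo "missing: $setting"
      status=1
    fi
  done
  return "$status"
}

# Example:
#   missing_sysctl_settings /etc/sysctl.conf \
#     net.ipv4.ip_forward=1 \
#     net.ipv4.conf.all.accept_redirects=0 \
#     net.ipv4.conf.all.send_redirects=0
```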

9. Create the VCN in OCI

At this point we are going to stop working on the AWS console, and start configuring the destination site for our tunnel, in OCI.

We are going to create the following elements (again, I assume you already have some basic knowledge of OCI to create them).

VCN: name VCN_OS, CIDR 10.0.0.0/16. Create an Internet Gateway within the VCN.

10. Create a Dynamic Routing Gateway (DRG)

Create a DRG and attach it to VCN_OS.

Screen Shot 2020-06-01 at 5.13.41 PM

11. Create a Route Table and a Security List

Route Table: name PubRT, with the following rules:

Destination: 10.1.0.0/16 (AWS CIDR), Target: DRG
Destination: 0.0.0.0/0, Target: IG

Screen Shot 2020-06-01 at 5.10.26 PM

Security List: name PubSL, with the following rules:

Screen Shot 2020-06-01 at 5.05.22 PM

Screen Shot 2020-06-01 at 5.05.37 PM

Note that the rules are pretty much the same as those created on AWS.

12. Create a public Subnet

Subnet: name Pub1, CIDR 10.0.0.0/24, public. Attached to the Security List PubSL and the Route Table PubRT.

Screen Shot 2020-06-01 at 5.14.29 PM

13. Create a Customer Premise Equipment (CPE)

The CPE is the representation in OCI of your VPN device at the remote site. In our case, it’s the Openswan VM in AWS.

Create the CPE with the name CPE_OS and associate it to the public IP of the Openswan in AWS (18.222.77.165). Select the vendor Libreswan (equivalent to Openswan) and the indicated version.

Screen Shot 2020-06-01 at 5.15.22 PM

14. Create an IPSec connection

Create the IPSec connection with the name IPSEC_OS, attached to the CPE and DRG previously created. Use a static route with CIDR 10.1.0.0/16 (which is the AWS VPC’s CIDR).

Screen Shot 2020-06-01 at 5.22.11 PM

After creating the IPSec connection, it will change the status to Provisioning, and you will be able to see the two redundant tunnels created:

Screen Shot 2020-06-01 at 5.26.00 PM

Let’s write down the IPs and secret keys created for each of the tunnels, because we will need them to configure Openswan:

Screen Shot 2020-06-01 at 5.29.01 PM

Tunnel 1: IP 129.213.7.37, key “aZpb5EpqSbuGcfw0zlVvXeHdYizxJSEvMBLkoI9KfiHgUBGIs7wuJ1992oT4pgV6”

Tunnel 2: IP 129.213.6.35, key “bnSulLZhPsKLUjDYsbNSg2RZ5SMAri9LCdXMstDpF7vYS0GEC58BsnnpBfLL8qBB”

After a while, you will see both tunnels as available, but with the IPSec status still Down. This status will change to Up only when both sides are configured properly.

Screen Shot 2020-06-01 at 5.32.04 PM

15. Create an instance Server2 in OCI

We need to create Server2 in OCI. It is worth remembering that on this side we do not need an instance for Openswan, since OCI’s IPSec VPN service will do the job.

Let’s create the Server2 instance attached to the public subnet Pub1 of our VCN VCN_OS. Don’t forget to upload your public key so you can connect via SSH later.

When the instance is provisioned, write down its IPs:

Server2 public IP: 129.213.168.185

Server2 private IP: 10.0.0.3

Screen Shot 2020-06-01 at 5.37.36 PM

16. Access Server2 via PING and SSH

ping 129.213.168.185
ssh -i ~/.ssh/ocikey opc@129.213.168.185

Screen Shot 2020-06-01 at 5.41.33 PM

17. Configure IPSEC in Openswan VM

Create the file /etc/ipsec.d/oci-ipsec.conf with the following content:

conn oracle-tunnel-1
	left=10.1.0.227
	leftid=18.222.77.165 # See preceding note about 1-1 NAT device
	right=129.213.7.37
	authby=secret
	leftsubnet=0.0.0.0/0 
	rightsubnet=0.0.0.0/0
	auto=start
	mark=5/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti1
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s
conn oracle-tunnel-2
	left=10.1.0.227
	leftid=18.222.77.165 # See preceding note about 1-1 NAT device
	right=129.213.6.35
	authby=secret
	leftsubnet=0.0.0.0/0
	rightsubnet=0.0.0.0/0
	auto=start
	mark=6/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti2
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s

Note that we are creating two VPN connections, pointing to the OCI tunnel IPs. The left parameter indicates Openswan’s private IP, and leftid indicates the public IP.

ATTENTION: ensure that the indentation won’t be lost in the copy / paste process, otherwise you will have problems.
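A quick way to catch lost indentation after pasting is to list every line that is not a section header, a comment, a blank line, or an indented parameter. This is a sketch, not an official validator: check_indentation is a hypothetical helper name, and it only looks at leading whitespace.

```shell
# Print (with line numbers) the lines of an ipsec .conf file that start in
# column 0 but are not "conn ..." / "config ..." headers, comments or blanks.
# Returns 1 when such lines exist, 2 when the file cannot be read.
check_indentation() {
  [ -r "$1" ] || return 2
  if grep -nEv '^(conn[[:space:]]|config[[:space:]]|#|[[:space:]]|$)' "$1"; then
    return 1
  fi
}

# Example:
#   check_indentation /etc/ipsec.d/oci-ipsec.conf && echo "indentation looks fine"
```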

18. Configure the secret keys file in Openswan VM

Create the file /etc/ipsec.d/oci-ipsec.secrets with the following content:

18.222.77.165 129.213.7.37: PSK "aZpb5EpqSbuGcfw0zlVvXeHdYizxJSEvMBLkoI9KfiHgUBGIs7wuJ1992oT4pgV6"
18.222.77.165 129.213.6.35: PSK "bnSulLZhPsKLUjDYsbNSg2RZ5SMAri9LCdXMstDpF7vYS0GEC58BsnnpBfLL8qBB"
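Because each tunnel needs a matching line in the secrets file, a small consistency check between the two files can save some debugging. The sketch below assumes the exact layout shown above (local ID, one space, peer IP, a colon, then PSK); peers_without_secret is a hypothetical helper name.

```shell
# For every "right=<ip>" in the .conf file, check that the .secrets file has
# a line containing " <ip>: PSK ". Prints the peers that have no such line.
peers_without_secret() {
  conf=$1
  secrets=$2
  status=0
  # "right=" does not match "rightsubnet=", so only the peer IPs are extracted.
  for peer in $(sed -n 's/^[[:space:]]*right=\([0-9.]*\).*/\1/p' "$conf"); do
    if ! grep -qF -- " $peer: PSK " "$secrets"; then
      echo "no PSK entry for $peer"
      status=1
    fi
  done
  return "$status"
}

# Example (as root, since the secrets file is normally not world-readable):
#   peers_without_secret /etc/ipsec.d/oci-ipsec.conf /etc/ipsec.d/oci-ipsec.secrets
```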

19. Restart IPSec in Openswan VM

sudo systemctl enable ipsec.service 
sudo systemctl restart ipsec.service
sudo systemctl status ipsec.service

Screen Shot 2020-06-01 at 5.54.06 PM

20. Check the tunnel status

If the tunnel is working, you should see two vti interfaces on the Openswan VM:

ifconfig | grep vti

Screen Shot 2020-06-01 at 5.55.48 PM

And let’s check if the connection between the two sides has been established:

sudo ipsec status | grep established

Note that Openswan shows the two tunnels are already active and with the connections established:

Screen Shot 2020-06-01 at 6.14.19 PM

If you do not get any results from the above command, even after waiting a few minutes, it is likely that something is wrong with the configuration.

After Openswan shows the connection has been established, wait a little longer and the OCI console should also show the tunnels are finally up:

Screen Shot 2020-06-01 at 6.17.16 PM
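Instead of re-running the status command by hand while waiting, the check can be retried in a loop. A minimal sketch: wait_for and tunnels_established are hypothetical names, and the attempt count and delay in the example are arbitrary choices.

```shell
# Retry a check command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARG...]
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Same check as above: succeeds once "ipsec status" reports an established connection.
tunnels_established() {
  sudo ipsec status | grep -q established
}

# Example: try every 10 seconds, for up to 2 minutes.
#   wait_for 12 10 tunnels_established || echo "tunnels still down, review the configuration"
```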

21. Add the proper routes in Openswan VM

Add the following route in Openswan VM:

ip route add 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2

If you want to make the route persistent, execute:

echo 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2 >> /etc/sysconfig/network-scripts/route-eth0
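Running the command above twice would add the route line twice. If you expect to repeat the step, an append that first checks for the line avoids duplicates. This is a sketch, with append_once as a hypothetical helper name:

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  line=$1
  file=$2
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example (as root):
#   append_once '10.0.0.0/16 nexthop dev vti1 nexthop dev vti2' \
#     /etc/sysconfig/network-scripts/route-eth0
```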

22. Test ping between Server1 and Server2

Now, with everything set, pings between Server1 (on AWS) and Server2 (on OCI) should work:

From Server1 (AWS):

ping 

Screen Shot 2020-06-01 at 7.51.06 PM

From Server2 (OCI):

ping 10.1.0.206

Screen Shot 2020-06-01 at 7.52.13 PM

If necessary, use tcpdump to debug.

23. Test SSH between Server1 and Server2

Before testing SSH, we need to copy the private keys into the servers.

Let’s copy the OCI private key into Server1 on AWS, and the AWS private key into Server2 on OCI.

scp -i ~/.ssh/awskey.pem ~/.ssh/ocikey ec2-user@3.12.166.223:~/.ssh
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@129.213.168.185:~/.ssh

Now let’s try to connect to Server1, and from there do SSH to Server2 (using its private IP):

ssh -i ~/.ssh/awskey.pem ec2-user@3.12.166.223
# ssh -i ~/.ssh/ocikey opc@10.0.0.3

Screen Shot 2020-06-01 at 8.00.16 PM

And to finish, let’s do the opposite test, connecting to Server2 first and then doing SSH to Server1 (using its private IP):

ssh -i ~/.ssh/ocikey opc@129.213.168.185
# ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.206

Screen Shot 2020-06-01 at 8.00.44 PM

And that’s it, I hope you like it and also manage to make it work.

Again, a very long post that demanded a lot of work, but definitely worth it!

See you next time.

Site-to-site IPSec VPN between AWS and OCI (with video)

Português/English

Hello, as I mentioned in my previous post, after having configured a site-to-site VPN between two Openswan VMs in different networks in Virtualbox, I will now do the same procedure using two clouds: AWS and OCI.

Since my study target is OCI, on the AWS side I will emulate an “on premise” environment, with the Openswan server at the end of the tunnel. On the other side of the tunnel, in OCI, I will use the dedicated IPSec VPN service, which is what I really want to see working.

This tutorial is also available on video here, in Portuguese only, aimed at the Oracle community in Brazil.

I hope you enjoy it.

Here is the diagram of the architecture I’m going to build:

Screen Shot 2020-06-01 at 3.03.24 PM

On the left side we have the AWS network, emulating the on-premise data center, and on the right side the OCI cloud. The two will be connected through the VPN.

The elements of my architecture are described below:

AWS VPC_OS – the network created on AWS that will hold the Openswan server and Server1. In this network I will create a public subnet to contain both servers.

OCI VCN_OS – the network created on OCI to hold Server2. Remember that there is no need to create an Openswan server on OCI, because I will use the IPSec VPN service.

On both clouds I will create an Internet Gateway to allow Internet access to and from any of the VMs. Note that I am using a single public subnet on each side only to simplify the process (SSH connection). From a security point of view, the right approach would be to place Server1 and Server2 in private subnets.

On the OCI network, I will need to create a CPE (Customer Premise Equipment) to represent the device that terminates the tunnel on the other side (in our case, the Openswan server). I will also need a DRG (Dynamic Routing Gateway) and to configure the IPSec service.

I used different colors for the connections to show the possible connectivity paths in the design.

Note: besides several other sites and blogs I consulted to adjust and debug the process, the main document I followed to create this procedure is at this link, which I recommend reading.

OK, let’s get started.

1. Create a new VPC on AWS

Note: I am assuming you have an AWS Free Tier account and the minimum knowledge needed to create and configure VPCs, subnets and the other basic resources I will use here. For that reason I will not include screenshots of every step, as that would make the post unnecessarily long. If you lack this knowledge, it is easy to find tutorials on the Internet on how to create the elements I am using 😉 .

Let’s create a new VPC on AWS named VPC_OS using the CIDR 10.1.0.0/16, and an associated public subnet using 10.1.0.0/24. Configure the subnet to assign public IPs automatically.

Also create an Internet Gateway and attach it to your VPC.

2. Create a Security Group

Now let’s create a Security Group named PublicSG in our VPC, with the following rules:

Screen Shot 2020-06-01 at 3.48.44 PMScreen Shot 2020-06-01 at 3.48.51 PM

Note that I am allowing all outbound traffic. The rules for ports 500 and 4500 are required for the Openswan IPSec VPN service to work. SSH is there to allow connections and ICMP to allow pings.

3. Create an instance for Openswan

For the AWS instances we are going to create, we can accept the default image and instance type, which are sufficient and eligible for the Amazon Free Tier.

In the instance configuration details, make sure you select the correct VPC and subnet, and enable public IP assignment.

Screen Shot 2020-06-01 at 4.28.22 PM

Also make sure you select the correct Security Group, the one we created earlier:

Screen Shot 2020-06-01 at 4.30.00 PM

After a few minutes your instance will be created. Rename it to Openswan in the console. Take note of the IPs assigned to it:

Openswan public IP: 18.222.77.165
Openswan private IP: 10.1.0.227

Screen Shot 2020-06-01 at 4.33.01 PM

After creating the instance, select it and click the Connect button to associate a key for SSH connections. If you do not have a key, you can have one generated and then reuse it for the other instances as well.

4. Create another instance for Server1

Use the same options as in the previous step. Let’s also take note of this VM’s IPs:

Server1 public IP: 3.12.166.223

Server1 private IP: 10.1.0.206

Screen Shot 2020-06-01 at 4.36.51 PM

5. Create a Route Table

Let’s create a Route Table named PubRT_OS, associated with the public subnet of our VPC.

Destination: 0.0.0.0/0, Target: Internet Gateway.

Destination: 10.0.0.0/16, Target: the Openswan instance.

Screen Shot 2020-06-01 at 3.55.16 PM

Note that I am already creating a routing rule that sends traffic destined for the remote subnet to the tunnel that will be created from the Openswan VM. The CIDR mentioned, 10.0.0.0/16, will be used at the other end of the tunnel, in the OCI VCN.

6. Connect to the instances and test ping and SSH

Openswan:

ping 18.222.77.165
ssh -i ~/.ssh/awskey.pem ec2-user@18.222.77.165

Screen Shot 2020-06-01 at 4.48.55 PM

Server1:

ping 3.12.166.223
ssh -i ~/.ssh/awskey.pem ec2-user@3.12.166.223

Screen Shot 2020-06-01 at 4.50.59 PM

7. Install Openswan on the Openswan VM

sudo yum install openswan lsof

8. Configure the Openswan VM

First, go to the AWS console, select the Openswan instance, then select Actions -> Network -> Change Source/Dest Checking and disable this check.

Next, back in the terminal, adjust the kernel parameters to enable IP forwarding and disable redirects:

Edit the file /etc/sysctl.conf and change/add:

net.ipv4.ip_forward=1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.eth0.accept_redirects = 0

To apply the changes, execute:

sudo sysctl -p
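
Before relying on sysctl -p, it can help to confirm what the file actually assigns, since the lines above mix the key=value and key = value spellings. The helper below is a hypothetical sketch, not a standard tool. It prints the last value assigned to a key in a sysctl.conf-style file. The dots in the key are treated as regex wildcards, which is good enough for a quick check.

```shell
# sysctl_conf_get KEY FILE
# Hypothetical helper: print the last value assigned to KEY in a
# sysctl.conf-style FILE, tolerating spaces around "=".
sysctl_conf_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}
```

For example, sysctl_conf_get net.ipv4.ip_forward /etc/sysctl.conf shows the value that sysctl -p will load for IP forwarding.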

9. Create the VCN on OCI

At this point we leave the AWS console and start configuring the destination site of our tunnel, on OCI.

On OCI we will create the following elements (once again, I assume you already have some basic OCI knowledge to create these resources).

VCN: name VCN_OS, CIDR 10.0.0.0/16. Create an Internet Gateway inside the VCN.

10. Create a Dynamic Routing Gateway (DRG)

Create a DRG and attach it to VCN_OS.

Screen Shot 2020-06-01 at 5.13.41 PM

11. Create a Route Table and a Security List

Route Table: name PubRT, with the following rules:

Destination: 10.1.0.0/16 (AWS CIDR), Target: DRG
Destination: 0.0.0.0/0, Target: IG

Screen Shot 2020-06-01 at 5.10.26 PM
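
Each route table sends the other side’s CIDR into the tunnel, so it is worth double-checking which CIDR a given private IP falls into. The sketch below does that with plain shell arithmetic. The function names ip_to_int and in_cidr are hypothetical, and the addresses in the usage note are the ones from this post.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
    echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# in_cidr IP NET/PREFIX: succeed when IP belongs to the given CIDR block.
in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

With the addresses used here, in_cidr 10.1.0.206 10.1.0.0/16 succeeds (Server1 is inside the AWS CIDR routed to the DRG), while in_cidr 10.1.0.206 10.0.0.0/16 fails.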

Security List: name PubSL, associated with the public subnet, with the following rules:

Screen Shot 2020-06-01 at 5.05.22 PMScreen Shot 2020-06-01 at 5.05.37 PM

Note that the rules are practically the same as those created on the AWS side.

12. Create a public subnet

Subnet: name Pub1, CIDR 10.0.0.0/24, public. Associated with the Security List PubSL and the Route Table PubRT.

Screen Shot 2020-06-01 at 5.14.29 PM

13. Create a Customer Premise Equipment (CPE)

The CPE is the OCI representation of your device that will be the end of the VPN tunnel at the remote site. In our case, it is the Openswan VM on AWS.

Create a CPE named CPE_OS and associate it with the public IP of Openswan on AWS (18.222.77.165). Select the vendor Libreswan (equivalent to Openswan) and the indicated version.

Screen Shot 2020-06-01 at 5.15.22 PM

14. Create an IPSec connection

Create the IPSec connection named IPSEC_OS, associated with the CPE and DRG created earlier. Use a static route with the CIDR 10.1.0.0/16 (the CIDR of the AWS VPC).

Screen Shot 2020-06-01 at 5.22.11 PM

After the IPSec connection is created, it will change to the Provisioning status, and you will already be able to see the two redundant tunnels:

Screen Shot 2020-06-01 at 5.26.00 PM

Let’s take note of the IPs and secret keys created for each tunnel, as we will need them to configure Openswan:

Screen Shot 2020-06-01 at 5.29.01 PM

Tunnel 1: IP 129.213.7.37, key “aZpb5EpqSbuGcfw0zlVvXeHdYizxJSEvMBLkoI9KfiHgUBGIs7wuJ1992oT4pgV6”

Tunnel 2: IP 129.213.6.35, key “bnSulLZhPsKLUjDYsbNSg2RZ5SMAri9LCdXMstDpF7vYS0GEC58BsnnpBfLL8qBB”

After some time, you will see both tunnels available, but with the IPSec status still Down. This status will only change to Up when both sides are configured and working.

Screen Shot 2020-06-01 at 5.32.04 PM

15. Create the Server2 instance on OCI

We need to create Server2 on OCI. Remember that on this side we do not need an Openswan instance, since the OCI IPSec VPN service will do the job.

Let’s create the Server2 instance attached to the public subnet Pub1 of our VCN VCN_OS. Don’t forget to upload your public key so you can connect via SSH later.

When the instance is provisioned, take note of the IPs assigned to it:

Server2 public IP: 129.213.168.185

Server2 private IP: 10.0.0.3

Screen Shot 2020-06-01 at 5.37.36 PM

16. Access Server2 via ping and SSH

ping 129.213.168.185
ssh -i ~/.ssh/ocikey opc@129.213.168.185

Screen Shot 2020-06-01 at 5.41.33 PM

17. Configure IPSec on the Openswan VM

Create the file /etc/ipsec.d/oci-ipsec.conf with the following content:

conn oracle-tunnel-1
	left=10.1.0.227
	leftid=18.222.77.165 # See preceding note about 1-1 NAT device
	right=129.213.7.37
	authby=secret
	leftsubnet=0.0.0.0/0 
	rightsubnet=0.0.0.0/0
	auto=start
	mark=5/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti1
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s
conn oracle-tunnel-2
	left=10.1.0.227
	leftid=18.222.77.165 # See preceding note about 1-1 NAT device
	right=129.213.6.35
	authby=secret
	leftsubnet=0.0.0.0/0
	rightsubnet=0.0.0.0/0
	auto=start
	mark=6/0xffffffff # Needs to be unique across all tunnels
	vti-interface=vti2
	vti-routing=no
	ikev2=no # To use IKEv2, change to ikev2=insist
	ike=aes_cbc256-sha2_384;modp1536
	phase2alg=aes_gcm256;modp1536
	encapsulation=yes
	ikelifetime=28800s
	salifetime=3600s

Note that we are creating two VPN connections, pointing to the IPs of the two tunnels created on OCI. The left parameter holds the private IP of the Openswan VM, and leftid holds its public IP.

ATTENTION: make sure the indentation is not lost when copying and pasting, otherwise you will run into problems.

18. Configure the secrets file on the Openswan VM

Create the file /etc/ipsec.d/oci-ipsec.secrets with the following content:

18.222.77.165 129.213.7.37: PSK "aZpb5EpqSbuGcfw0zlVvXeHdYizxJSEvMBLkoI9KfiHgUBGIs7wuJ1992oT4pgV6"
18.222.77.165 129.213.6.35: PSK "bnSulLZhPsKLUjDYsbNSg2RZ5SMAri9LCdXMstDpF7vYS0GEC58BsnnpBfLL8qBB"
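
Each line of the secrets file follows the same pattern: local public IP, tunnel IP, a colon, then PSK and the quoted key. To avoid copy/paste slips, the lines can be generated. The helper name psk_line below is hypothetical and the key in the usage note is a placeholder, not a real secret.

```shell
# psk_line LOCAL_ID REMOTE_IP KEY
# Hypothetical helper: print one ipsec.secrets entry in the format used above.
psk_line() {
    printf '%s %s: PSK "%s"\n' "$1" "$2" "$3"
}
```

For example: psk_line 18.222.77.165 129.213.7.37 'your-tunnel-1-key' | sudo tee -a /etc/ipsec.d/oci-ipsec.secrets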

19. Restart the IPSec service on the Openswan VM

sudo systemctl enable ipsec.service 
sudo systemctl restart ipsec.service
sudo systemctl status ipsec.service

Screen Shot 2020-06-01 at 5.54.06 PM

20. Check the tunnel status

If the tunnel is working, you should see two vti interfaces on the Openswan VM:

ifconfig | grep vti

Screen Shot 2020-06-01 at 5.55.48 PM

Now let’s verify that the connection between the two sides has been established:

sudo ipsec status | grep established

Note that Openswan shows both tunnels are already active with the connection established:

Screen Shot 2020-06-01 at 6.14.19 PM

If the command above returns nothing, even after waiting a few minutes, you have probably made a mistake somewhere in the configuration.

After Openswan reports that the connection has been established, wait a little longer and the OCI console should also show that the tunnels are finally Up:

Screen Shot 2020-06-01 at 6.17.16 PM

21. Add the proper routes on the Openswan server

Add the following route on the Openswan server:

ip route add 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2

If you want to make the route persistent, execute the command below:

echo 10.0.0.0/16 nexthop dev vti1 nexthop dev vti2 >> /etc/sysconfig/network-scripts/route-eth0

22. Test ping between Server1 and Server2

Now, with everything configured, pings between Server1 (on AWS) and Server2 (on OCI) should work:

From Server1 (AWS):

ping 10.0.0.3

Screen Shot 2020-06-01 at 7.51.06 PM

From Server2 (OCI):

ping 10.1.0.206

Screen Shot 2020-06-01 at 7.52.13 PM

If necessary, use the tcpdump command to debug.

23. Test SSH between Server1 and Server2

Before testing SSH, we need to copy the private keys into the servers.

Let’s copy the OCI private key into Server1 on AWS, and the AWS private key into Server2 on OCI.

scp -i ~/.ssh/awskey.pem ~/.ssh/ocikey ec2-user@3.12.166.223:~/.ssh
scp -i ~/.ssh/ocikey ~/.ssh/awskey.pem opc@129.213.168.185:~/.ssh

Now let’s connect to Server1, and from there SSH to Server2 (using its private IP):

ssh -i ~/.ssh/awskey.pem ec2-user@3.12.166.223
# ssh -i ~/.ssh/ocikey opc@10.0.0.3

Screen Shot 2020-06-01 at 8.00.16 PM

And to finish, let’s do the opposite test, connecting to Server2 first and from there opening an SSH session to Server1 (using its private IP):

ssh -i ~/.ssh/ocikey opc@129.213.168.185
# ssh -i ~/.ssh/awskey.pem ec2-user@10.1.0.206

Screen Shot 2020-06-01 at 8.00.44 PM

And that’s it, folks. I hope you like it and also manage to test the OCI IPSec VPN service.

Again, a very long post that demanded a lot of work, but it was worth it!

Cheers, and see you next time.

Site-to-site IPSec VPN with Virtualbox and Openswan (with video)

 

Português/English

In this post I will demonstrate all the steps to build a site-to-site IPSec VPN tunnel, using Virtualbox VMs and networks, and the software Openswan.

My goal when creating and testing this procedure was to learn a little about the concepts and configuration of a site-to-site VPN, so that I could later apply it to a VPN configuration on Oracle Cloud Infrastructure (OCI). My next post will be exactly about that.

Here I will go through the whole process, from installing the operating system and configuring Openswan and the IPSec tunnel to performing all the tests.

I am presenting this tutorial on video as well, as I have been doing lately. Here is the link.

I hope you like it.

I’ll start by showing the architecture design of what I’m gonna build:

Screen Shot 2020-05-31 at 7.02.22 PM

On the left side we have site 1 and on the right, site 2. They will be connected through the VPN.

As I am using Virtualbox VMs only, and I don’t have the fixed public IPs that a real configuration through the Internet would need, I will use only private IPs and three Virtualbox Host-Only networks. Below I describe the elements of my architecture:

Vboxnet0 / Vboxnet1 – the internal networks of sites 1 and 2 respectively.

VBoxnet2 – my FAKE Internet, actually a third private network that will connect the two sides of the VPN.

Openswan1 / Openswan2 – the VPN servers of sites 1 and 2 respectively.

Server1 / Server2 – the sample servers of networks 1 and 2 respectively, which will use the tunnel to communicate with each other.

So, my final goal is to allow Server1 and Server2 to communicate using their private IPs, in different networks, through the VPN tunnel I will create.

Note in the design that I also put a reference to other possible servers, to show that several servers could use the VPN at the same time. For this demo, however, I will only create the servers Server1 and Server2 (besides Openswan1 and Openswan2).

1. Download CentOS

I could have chosen any Linux distro, but I preferred CentOS because it’s Red Hat compatible and it has an option to download only the minimal installation.

From this link I downloaded “CentOS-7-x86_64-Minimal-2003.iso” that I will use to install the system.

2. Configure the Host-Only networks on Virtualbox

As I mentioned before, I will need three Host-Only networks on Virtualbox, to emulate the two separate sites and my “FAKE Internet”.

So, on Virtualbox go to File => Host Network Manager, and create / configure the following networks:

Screen Shot 2020-05-31 at 7.21.30 PM

Vboxnet0 – IPv4 192.168.56.1 and mask 255.255.255.0.

Vboxnet1 – IPv4 192.168.57.1 and mask 255.255.255.0.

VBoxnet2 – IPv4 192.168.58.1 and mask 255.255.255.0.

Make sure the three networks have DHCP enabled (the image shows the first one):

Screen Shot 2020-05-31 at 7.24.55 PM

3. Create a Virtualbox VM to install Openswan

I guess you are familiar with the process of creating a VM on Virtualbox. The one we are going to create now is very simple: you only need to choose its name (Openswan1) and set the type to Linux / Oracle 64-bit.

After creating the VM, we need to configure the network adapters.

As this Openswan1 will be the VPN server on site 1, it will be connected to the networks vboxnet0 and vboxnet2, but not vboxnet1. Later when I create Openswan2, it will be connected to vboxnet1 and vboxnet2, but not vboxnet0. I will have a third adapter as well, using NAT to allow Internet access to download packages.

Click on Settings => Network and set up the three network adapters as follows:

Adapter 1: Host-only, attached to vboxnet0:

Screen Shot 2020-05-31 at 7.37.53 PM

Adapter 2: Host-only, attached to vboxnet2:

Screen Shot 2020-05-31 at 7.38.01 PM

Adapter 3: NAT:

Screen Shot 2020-05-31 at 7.38.08 PM

4. Install the Operating System

“Insert” the OS CD in the virtual CD and start the VM for the first time. Then choose “Install CentOS 7”.

From the configuration screen, enable the three network adapters and change hostname to openswan1.localdomain. Click on Begin Installation, and while the system is installing click on Root Password and configure a password for the root user.

5. Configure the network adapters

After the OS installation and boot, connect as root and verify if the three network adapters are online:

Use the following command to see the network cards’ status:

nmcli d

Screen Shot 2020-05-31 at 8.07.25 PM

In my case, only the first adapter was automatically enabled.

To configure them, execute:

nmtui

Select Edit a Connection, choose one adapter at a time and mark the option Automatically Connect:

Screen Shot 2020-05-31 at 8.10.49 PM

After changing all the three cards, they should be enabled and working properly:

Screen Shot 2020-05-31 at 8.12.14 PM

To test, ping the two network gateways and Google DNS:

ping 192.168.56.1
ping 192.168.58.1
ping 8.8.8.8

Screen Shot 2020-05-31 at 8.13.46 PM

6. Create the other VMs from Openswan1 clones

Stop Openswan1 VM and make 3 full clones of it.

Name the new VMs Openswan2, Server1 and Server2.

7. Adjust the adapters on each new VM

Openswan2: 1 = vboxnet1; 2 = vboxnet2; 3 = NAT

Server1: 1 = vboxnet0; disable adapters 2 and 3.

Server2: 1 = vboxnet1; disable adapters 2 and 3.

8. Adjust hostnames on each new VM

Execute the following command on each new VM, giving them the names openswan2, server1 and server2 respectively. Restart them right after the change.

hostnamectl set-hostname openswan2
reboot

9. Verify the IPs of each VM

Let’s see which IPs each VM received via DHCP:

Screen Shot 2020-05-31 at 8.30.35 PMScreen Shot 2020-05-31 at 8.30.55 PMScreen Shot 2020-05-31 at 8.31.12 PMScreen Shot 2020-05-31 at 8.33.12 PM

As you can see, in my case the IPs are as follows (take note of your own, because you will need them a lot in the next steps):

Openswan1: 192.168.56.106 and 192.168.58.6

Openswan2: 192.168.57.10 and 192.168.58.7

Server1: 192.168.56.107

Server2: 192.168.57.9
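
Since all three Host-Only networks from step 2 use a 255.255.255.0 mask, the network an address belongs to can be read from its first three octets. A small sketch of that mapping follows. The function name vbox_net_of is hypothetical, and the prefixes are the ones configured in step 2.

```shell
# vbox_net_of IP
# Hypothetical helper: print the Host-Only network an address belongs to,
# based on the /24 prefixes configured in step 2.
vbox_net_of() {
    case "$1" in
        192.168.56.*) echo vboxnet0 ;;
        192.168.57.*) echo vboxnet1 ;;
        192.168.58.*) echo vboxnet2 ;;
        *) echo unknown ;;
    esac
}
```

For example, vbox_net_of 192.168.58.6 reports that the second IP of Openswan1 sits on vboxnet2, the “FAKE Internet”.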

Remember that IPs 192.168.56.x are from the vboxnet0 network, IPs 192.168.57.x are from vboxnet1 and IPs 192.168.58.x are from vboxnet2.

I could configure the servers with fixed IPs instead of DHCP, but for our test it doesn’t matter.

10. Test PING among VMs

Now let’s test the communication among the VMs.

If you remember the architecture design, not all VMs can communicate with each other because they are in different networks.

From Openswan1 VM, let’s test ping to Openswan2, and vice-versa. For this we will use the IPs of our “FAKE Internet”, which are those from vboxnet2. The tests should work:

Screen Shot 2020-05-31 at 8.56.27 PMScreen Shot 2020-05-31 at 8.56.19 PM

Now, let’s test ping from Server1 to Openswan1 (with its two IPs), Openswan2 (with its two IPs) and Server2. Only the ping to Openswan1 through the vboxnet0 IP should work. The other tests won’t work because Server1 doesn’t have access to vboxnet1 and vboxnet2.

Screen Shot 2020-05-31 at 9.09.46 PM

Do the same tests from Server2. Similarly, only the ping to Openswan2 using the vboxnet1 IP should work.

11. Install the Openswan software on both Openswan VMs

Execute the following command on VMs Openswan1 and Openswan2 to install Openswan:

yum install openswan lsof

12. Adjust the Operating System on Openswan VMs

Enable IP forwarding and disable redirects by adding or changing the following lines in the file /etc/sysctl.conf:

net.ipv4.ip_forward = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0

And to validate the changes, execute:

sysctl -p /etc/sysctl.conf

Turn the firewall on and configure the necessary rules:

systemctl enable firewalld
systemctl start firewalld
systemctl status firewalld
firewall-cmd --zone=public --add-port=500/udp --permanent
firewall-cmd --zone=public --add-port=4500/tcp --permanent
firewall-cmd --zone=public --add-port=4500/udp --permanent
firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 1 -j ACCEPT
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 1 -j ACCEPT

Now, to execute the next command we need to obtain the name of the network adapter that will be receiving packets from the private networks of each site (vboxnet0 in Openswan1 and vboxnet1 in Openswan2).

On Openswan1 execute the following command and save the adapter name:

ip addr | grep "192.168.56"

Screen Shot 2020-05-31 at 9.31.16 PM

Now do the same on Openswan2:

ip addr | grep "192.168.57"

Screen Shot 2020-05-31 at 9.31.26 PM

In my case, both adapters are enp0s3.

Now, execute the following on both servers, adjusting the adapter name if necessary and replacing the sit_one_subnet placeholder with the address of your private subnet:

firewall-cmd --permanent --direct --passthrough ipv4 -t nat -I POSTROUTING -o enp0s3 -j MASQUERADE -s sit_one_subnet/24
systemctl restart firewalld
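
Rules added with --permanent only become active after firewalld is reloaded or restarted, so it is worth checking the result once the restart above has finished. The sketch below reads the output of firewall-cmd --zone=public --list-ports and prints any IPSec port that is still missing. The function name missing_ports is hypothetical.

```shell
# missing_ports
# Hypothetical helper: read a space-separated port list on stdin (the format
# printed by `firewall-cmd --list-ports`) and print each required IPSec port
# that is not in it.
missing_ports() {
    open=" $(tr '\n' ' ') "
    for p in 500/udp 4500/udp; do
        case "$open" in
            *" $p "*) ;;          # port is open, nothing to report
            *) echo "$p" ;;
        esac
    done
}
```

Usage: firewall-cmd --zone=public --list-ports | missing_ports. No output means both UDP ports used by IKE and NAT traversal are open.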

13. Configure Openswan on both VMs

Recalling our architecture once more, the vboxnet2 network is playing the role of the Internet, so its IPs (192.168.58.x) are the equivalent of the public IPs of the Openswan servers.

The server IPs are:

Openswan1: 192.168.56.106 (private) and 192.168.58.6 (“public”)

Openswan2: 192.168.57.10 (private) and 192.168.58.7 (“public”)

Let’s create the IPSec file /etc/ipsec.conf with the following content.

On Openswan1:

config setup
	protostack=netkey
	nat_traversal=yes
	oe=off
conn vpn1
	authby=secret
	auto=start
	compress=no
	pfs=yes
	type=tunnel
	left=192.168.58.6
	leftsourceip=192.168.56.106
	leftsubnet=192.168.56.0/24
	leftnexthop=%defaultroute
	right=192.168.58.7
	rightsubnet=192.168.57.0/24

On Openswan2:

config setup
	protostack=netkey
	nat_traversal=yes
	oe=off
conn vpn1
	authby=secret
	auto=start
	compress=no
	pfs=yes
	type=tunnel
	leftnexthop=%defaultroute
	left=192.168.58.7
	leftsourceip=192.168.57.10
	leftsubnet=192.168.57.0/24
	right=192.168.58.6
	rightsubnet=192.168.56.0/24

ATTENTION: make sure to keep indentation as it appears here.
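
The two conn sections above are mirror images of each other: what is left on one server is right on the other. As a sketch, the section can be rendered from five values so that both files stay consistent and the tab indentation is preserved. The function name render_conn is hypothetical; the option names and values are the ones used in the files above.

```shell
# render_conn LOCAL_PUBLIC LOCAL_PRIVATE LOCAL_SUBNET REMOTE_PUBLIC REMOTE_SUBNET
# Hypothetical helper: print the "conn vpn1" section for one side of the
# tunnel. Option lines are indented with a literal tab.
render_conn() {
    cat <<EOF
conn vpn1
	authby=secret
	auto=start
	compress=no
	pfs=yes
	type=tunnel
	left=$1
	leftsourceip=$2
	leftsubnet=$3
	leftnexthop=%defaultroute
	right=$4
	rightsubnet=$5
EOF
}
```

On Openswan1 this would be called as render_conn 192.168.58.6 192.168.56.106 192.168.56.0/24 192.168.58.7 192.168.57.0/24, and on Openswan2 with the two sides swapped. The config setup section still has to be written separately.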

The goal here is not to explain every parameter in detail. Just note:

Tunnel name: vpn1

left and right: the “public” IPs of both servers

leftsourceip: the private IP of the server.

leftsubnet and rightsubnet: the private subnet exposed through the tunnel on each side.

Note that the “left” parameters refer to the local server, and the “right” parameters refer to the remote server. That’s why they are swapped in each server’s configuration file.

14. Configure the secrets file

Let’s put the following content in file /etc/ipsec.secrets on both servers:

192.168.58.6 192.168.58.7: PSK "password"

Note that here we are using our “public” IPs, and the secret key is “password”.

15. Start the IPSec service

systemctl enable ipsec.service
systemctl restart ipsec.service
systemctl status ipsec.service

Screen Shot 2020-05-31 at 10.23.50 PM

If you receive an error message saying that IPSec was not initialized, execute the following commands and then re-execute the previous ones:

ipsec initnss --nssdir /etc/ipsec.d
ipsec newhostkey --output /etc/ipsec.secrets --bits 2192 --verbose --hostname $HOSTNAME

16. Verify if the tunnel is active

ipsec status | grep established
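
The grep above usually prints more than one line per tunnel, because both the IKE phase and the IPSec phase report “established”. As a sketch, the helper below counts only the phase-2 lines. It assumes the status output contains the phrase “IPsec SA established” for each active tunnel, which may vary between Openswan and Libreswan versions. The function name is hypothetical.

```shell
# count_ipsec_sas
# Hypothetical helper: read `ipsec status` output on stdin and print how many
# lines report an established IPsec (phase 2) SA.
count_ipsec_sas() {
    # grep -c prints 0 but exits non-zero when nothing matches
    grep -c 'IPsec SA established' || true
}
```

Usage: ipsec status | count_ipsec_sas. For the single vpn1 tunnel configured here, a result of 0 would suggest the tunnel is not up yet.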

Screen Shot 2020-05-31 at 10.24.48 PM

17. Configure the routes on servers Server1 and Server2

On Server1:

ip route add 192.168.57.0/24 via 192.168.56.106

Screen Shot 2020-05-31 at 10.26.20 PM

On Server2:

ip route add 192.168.56.0/24 via 192.168.57.10

Screen Shot 2020-05-31 at 10.26.29 PM

To make the routes persistent, you could copy the route to /etc/sysconfig/network-scripts/route-enp0s3 (adjusting the adapter name if necessary).

18. Test the ping between servers Server1 and Server2

Now the ping should work, because both sites are connected through the IPSec tunnel!

Let’s do the final ping test, and at the same time monitor the packets arriving on Openswan1 and Openswan2. For this, SSH to both servers and execute:

yum install tcpdump
tcpdump

Leave the command running on both Openswan servers and test the ping from Server1 to Server2:

Screen Shot 2020-05-31 at 10.30.57 PMScreen Shot 2020-05-31 at 10.31.08 PMScreen Shot 2020-05-31 at 10.31.19 PM

Note the ping worked as expected, and we can see in the tcpdump output the ICMP request and reply packets on both servers, confirming the communication is happening through the IPSec tunnel we’ve just created.

Do the opposite test as well, pinging Server1 from Server2.

19. Test SSH connection between Server1 and Server2

Just to finish, let’s make a real connection between the servers, via SSH.

This time I will start from Server2, trying to SSH to Server1:

Screen Shot 2020-05-31 at 10.38.02 PM

And that’s it! It’s a long post, but it was worth it, at least for me ;-).

In my next post I will demonstrate how to connect an AWS VPC, using an Openswan server like the one I used here to emulate an on-premise environment, to the Oracle OCI IPSec VPN service.

See you.

 

Site-to-site IPSec VPN with Virtualbox and Openswan (with video)

Português/English

In this post I will demonstrate all the steps to build a site-to-site VPN tunnel, using Virtualbox networks and VMs, and the software Openswan.

My goal when creating and testing this procedure was to learn a little about the concepts and configuration of a site-to-site VPN, so that I could later apply it to a site-to-site VPN configuration on Oracle Cloud Infrastructure (OCI). My next post will be precisely about that OCI configuration.

So in this post I will go through the whole process, from installing the operating system, installing and configuring Openswan and configuring the IPSec tunnel to running the tests.

I am also presenting this tutorial on video, as I have done in my latest posts. Here is the link.

I hope you like it.

I will start by showing the architecture design I am going to build:

Screen Shot 2020-05-31 at 7.02.22 PM

On the left side we have site 1 and on the right side site 2, which will be connected through the VPN.

As I am using Virtualbox VMs only, and I don’t have the fixed public IPs that a real configuration through the Internet would need, I will use only private IPs and three Host-Only networks in Virtualbox. Below I describe the elements of my architecture:

Vboxnet0 / Vboxnet1 – the internal networks of sites 1 and 2 respectively.

VBoxnet2 – my FAKE Internet, actually a third private network that will connect the two ends of the VPN.

Openswan1 / Openswan2 – the VPN servers of sites 1 and 2 respectively.

Server1 / Server2 – the sample servers of network 1 and network 2 respectively, which will use the tunnel to communicate with each other.

In other words, the final goal is to allow Server1 and Server2 to communicate using their private IPs, in different networks, through the VPN tunnel that will be created.

Note that in the design I also placed a reference to other possible servers in each site, to show that several servers in each network could use the VPN at the same time. For this demo, I will only create the servers Server1 and Server2 (besides Openswan1 and Openswan2).

1. Download CentOS

I could have chosen any other Linux distribution, but I preferred CentOS, which is Red Hat compatible and has a “minimal” download option. As I will not need much installed in the operating system, it would make no sense to download a huge “iso”.

From this link I downloaded “CentOS-7-x86_64-Minimal-2003.iso”, which I will use to install the operating system.

2. Configure the Host-Only networks on your Virtualbox

As I mentioned above in the architecture design, I will need three Virtualbox Host-Only networks, to simulate two separate sites that do not communicate with each other, and my “FAKE Internet”.

So, on your Virtualbox go to File => Host Network Manager, and create / configure the three networks we are going to use, as shown below:

Screen Shot 2020-05-31 at 7.21.30 PM

Vboxnet0 – IPv4 192.168.56.1 and mask 255.255.255.0.

Vboxnet1 – IPv4 192.168.57.1 and mask 255.255.255.0.

VBoxnet2 – IPv4 192.168.58.1 and mask 255.255.255.0.

Make sure the three networks have DHCP enabled (here I show the first one):

Screen Shot 2020-05-31 at 7.24.55 PM

3. Create a Virtualbox VM to install Openswan

I imagine you are already familiar with the process of creating VMs on Virtualbox. The VM we are going to create is quite simple, so you only need to give it a name (Openswan1) and set the type to Linux / Oracle 64-bit.

After creating the VM, we need to configure the network adapters.

As this Openswan1 will be the VPN server of site 1, it will be connected to the networks vboxnet0 and vboxnet2, but not to vboxnet1. Later, when I create Openswan2, it will be connected to vboxnet1 and vboxnet2, but not to vboxnet0. I will also have a third adapter using NAT, to access the Internet and download packages.

Click on Settings => Network and configure the three network adapters as follows:

Adapter 1: Host-only, attached to vboxnet0:

Screen Shot 2020-05-31 at 7.37.53 PM

Adapter 2: Host-only, attached to vboxnet2:

Screen Shot 2020-05-31 at 7.38.01 PM

Adapter 3: NAT:

Screen Shot 2020-05-31 at 7.38.08 PM

4. Install the Operating System

“Insert” the operating system CD in the virtual CD drive and start the VM for the first time. Then choose the option “Install CentOS 7”.

From the installation configuration screen, enable the three network adapters and change the hostname to openswan1.localdomain. Click on Begin Installation, and while the system is installing, click on Root Password and set a password for the root user.

5. Configure the network adapters

After the installation finishes and the system reboots, connect as root and verify that the three network adapters are “on” and working.

Use the command below to see which adapters are up:

nmcli d

Screen Shot 2020-05-31 at 8.07.25 PM

Note that in my case only one of the adapters was enabled after the installation.

To configure the adapters, execute:

nmtui

Select Edit a Connection, choose each adapter in turn and mark the option Automatically Connect:

Screen Shot 2020-05-31 at 8.10.49 PM

After making the change for all three adapters, they should be enabled.

Screen Shot 2020-05-31 at 8.12.14 PM

To test, ping the gateways of the two networks and the Google DNS:

ping 192.168.56.1
ping 192.168.58.1
ping 8.8.8.8

Screen Shot 2020-05-31 at 8.13.46 PM

6. Create the other VMs as clones of Openswan1

Shut down the Openswan1 VM and make 3 full clones of it.

Name the new VMs Openswan2, Server1 and Server2.

7. Adjust the network adapters on each new VM

Openswan2: 1 = vboxnet1; 2 = vboxnet2; 3 = NAT

Server1: 1 = vboxnet0; disable adapters 2 and 3.

Server2: 1 = vboxnet1; disable adapters 2 and 3.

8. Adjust the hostnames of the 3 new VMs

On each of them execute the command below, giving them the names openswan2, server1 and server2 respectively. Restart the VMs after changing the hostname.

hostnamectl set-hostname openswan2
reboot

9. Check the IPs of each VM

Let's check which IPs each VM received via DHCP:

Screen Shot 2020-05-31 at 8.30.35 PMScreen Shot 2020-05-31 at 8.30.55 PMScreen Shot 2020-05-31 at 8.31.12 PMScreen Shot 2020-05-31 at 8.33.12 PM

As you can see above, in my case the assigned IPs were the following (write down your own, because you will use them a lot in the next steps):

Openswan1: 192.168.56.106 and 192.168.58.6

Openswan2: 192.168.57.10 and 192.168.58.7

Server1: 192.168.56.107

Server2: 192.168.57.9

Remember that the 192.168.56.x IPs belong to the vboxnet0 network, the 192.168.57.x IPs belong to vboxnet1 and the 192.168.58.x IPs belong to vboxnet2.

I could also configure the servers with static IPs instead of DHCP-assigned ones, but for our test it makes no difference.

10. Run PING tests between the VMs

Now let's test the communication between the VMs.

If you remember the architecture diagram, not all VMs can communicate with each other, because they are on different networks.

From the Openswan1 VM, let's ping the Openswan2 VM, and vice versa. For this we will use the IPs of our “FAKE Internet”, which are the vboxnet2 ones. These tests should work:

Screen Shot 2020-05-31 at 8.56.27 PMScreen Shot 2020-05-31 at 8.56.19 PM

Now let's ping from Server1 to Openswan1 (both of its IPs), Openswan2 (both of its IPs) and Server2. Only the ping to Openswan1 through its vboxnet0 IP should work. The other tests should fail because Server1 has no access to the vboxnet1 and vboxnet2 networks.

Screen Shot 2020-05-31 at 9.09.46 PM

Run the same tests from Server2. Similarly, only the ping to Openswan2 through its vboxnet1 IP should work.

11. Install the Openswan software on the two VMs

Run the following command on the Openswan1 and Openswan2 VMs to install the Openswan software:

yum install openswan lsof

12. Adjust the Operating System of the Openswan VMs

Enable IP forwarding and disable redirects by adding the following lines to the /etc/sysctl.conf file:

net.ipv4.ip_forward = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0

To apply the changes, run:

sysctl -p /etc/sysctl.conf
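
Before moving on, it can be handy to confirm that the three settings really ended up in the file. The function below is only a sketch: check_sysctl_conf is a hypothetical helper name, not part of any package, and it only reads the file passed as its argument (for example /etc/sysctl.conf).

```shell
#!/bin/bash
# check_sysctl_conf is a hypothetical helper (not part of Openswan or CentOS).
# It reads the given file and reports whether the three settings used in
# this post are present with the expected values. It changes nothing.
check_sysctl_conf() {
    local file="$1" status=0 entry key want got
    for entry in \
        "net.ipv4.ip_forward=1" \
        "net.ipv4.conf.all.accept_redirects=0" \
        "net.ipv4.conf.all.send_redirects=0"
    do
        key="${entry%%=*}"    # text before the "="
        want="${entry#*=}"    # text after the "="
        # Strip whitespace from every line, then keep the value of the last
        # line whose name matches exactly (commented lines never match).
        got="$(awk -F= -v k="$key" '{ gsub(/[[:space:]]/, "") } $1 == k { v = $2 } END { print v }' "$file")"
        if [ "$got" = "$want" ]; then
            echo "OK    $key = $got"
        else
            echo "CHECK $key: expected $want, found '${got}'"
            status=1
        fi
    done
    return "$status"
}

# Example: check_sysctl_conf /etc/sysctl.conf
```

The function returns a non-zero status when any of the three values is missing or different, so it can also be used in an if statement.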

Enable the firewall and configure the rules to open the required ports:

systemctl enable firewalld
systemctl start firewalld
systemctl status firewalld
firewall-cmd --zone=public --add-port=500/udp --permanent
firewall-cmd --zone=public --add-port=4500/tcp --permanent
firewall-cmd --zone=public --add-port=4500/udp --permanent
firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 1 -j ACCEPT
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 1 -j ACCEPT

Now, to run the next command, we need the name of the network adapter that will receive the packets from the private network of each site (vboxnet0 on Openswan1 and vboxnet1 on Openswan2).

On Openswan1, run the command below and take note of the adapter name:

ip addr | grep "192.168.56"

Screen Shot 2020-05-31 at 9.31.16 PM

Now do the same on Openswan2:

ip addr | grep "192.168.57"

Screen Shot 2020-05-31 at 9.31.26 PM

In my case, both adapters are named enp0s3.

Now run the following on both servers, taking care to adjust the network adapter name and the subnet (the sit_one_subnet placeholder) for each site:

firewall-cmd --permanent --direct --passthrough ipv4 -t nat -I POSTROUTING -o enp0s3 -j MASQUERADE -s sit_one_subnet/24
systemctl restart firewalld
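
The sit_one_subnet value in the command above is a placeholder, not a literal. Assuming it stands for the /24 private network of each site (192.168.56.0 on Openswan1 and 192.168.57.0 on Openswan2 in my lab), the sketch below derives it from the server's private IP and prints the resulting command instead of running it. The function names net24 and masquerade_cmd are hypothetical helpers.

```shell
#!/bin/bash
# net24 and masquerade_cmd are hypothetical helpers for illustration.

# Print the /24 network address of an IPv4 address
# (192.168.56.106 -> 192.168.56.0).
net24() {
    echo "${1%.*}.0"
}

# Print (not run) the masquerade rule for an interface and a private IP.
# Assumption: the placeholder in the post means the site's private /24 subnet.
masquerade_cmd() {
    local iface="$1" private_ip="$2"
    echo "firewall-cmd --permanent --direct --passthrough ipv4 -t nat -I POSTROUTING -o $iface -j MASQUERADE -s $(net24 "$private_ip")/24"
}

# Openswan1 in this lab:
masquerade_cmd enp0s3 192.168.56.106
# Openswan2 in this lab:
masquerade_cmd enp0s3 192.168.57.10
```

Review the printed command before running it as root, and remember that the firewalld restart is still needed afterwards.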

13. Configure Openswan on the two VMs

Recalling our architecture once more, the vboxnet2 network plays the role of the Internet, that is, the IPs on this network (192.168.58.x) are the equivalent of the public IP of each Openswan server.

The servers' IPs are:

Openswan1: 192.168.56.106 (private) and 192.168.58.6 (“public”)

Openswan2: 192.168.57.10 (private) and 192.168.58.7 (“public”)

Let's edit the IPSec configuration file /etc/ipsec.conf and put the content below in it.

On the Openswan1 server:

config setup
	protostack=netkey
	nat_traversal=yes
	oe=off
conn vpn1
	authby=secret
	auto=start
	compress=no
	pfs=yes
	type=tunnel
	left=192.168.58.6
	leftsourceip=192.168.56.106
	leftsubnet=192.168.56.0/24
	leftnexthop=%defaultroute
	right=192.168.58.7
	rightsubnet=192.168.57.0/24

And on the Openswan2 server:

config setup
	protostack=netkey
	nat_traversal=yes
	oe=off
conn vpn1
	authby=secret
	auto=start
	compress=no
	pfs=yes
	type=tunnel
	leftnexthop=%defaultroute
	left=192.168.58.7
	leftsourceip=192.168.57.10
	leftsubnet=192.168.57.0/24
	right=192.168.58.6
	rightsubnet=192.168.56.0/24

ATTENTION: make sure you keep the indentation exactly as shown above.

The goal here is not to explain each parameter in detail. Just note:

Name of the tunnel being created: vpn1

left and right: the “public” IPs of the two servers.

leftsourceip: the private IP of that server.

leftsubnet and rightsubnet: the private networks exposed through the tunnel on each side.

Note that the “left” parameters refer to the local server and the “right” parameters refer to the remote server. That is why these parameters are swapped on each server.
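
To make the left/right mirroring concrete, here is a small sketch that prints the conn section for either server from the same five values. print_conn is a hypothetical helper of mine, not an Openswan tool; it only writes to standard output, and the parameters are the same ones used in the two files above.

```shell
#!/bin/bash
# print_conn is a hypothetical helper. Arguments, in order:
#   local "public" IP, local private IP, local subnet,
#   remote "public" IP, remote subnet
print_conn() {
    printf 'conn vpn1\n'
    printf '\tauthby=secret\n'
    printf '\tauto=start\n'
    printf '\tcompress=no\n'
    printf '\tpfs=yes\n'
    printf '\ttype=tunnel\n'
    printf '\tleft=%s\n' "$1"
    printf '\tleftsourceip=%s\n' "$2"
    printf '\tleftsubnet=%s\n' "$3"
    printf '\tleftnexthop=%%defaultroute\n'
    printf '\tright=%s\n' "$4"
    printf '\trightsubnet=%s\n' "$5"
}

# Openswan1: local values first, remote values last.
print_conn 192.168.58.6 192.168.56.106 192.168.56.0/24 192.168.58.7 192.168.57.0/24
# Openswan2: the same call with the two sides swapped.
print_conn 192.168.58.7 192.168.57.10 192.168.57.0/24 192.168.58.6 192.168.56.0/24
```

The output uses tab indentation, matching the files above; the config setup section is the same on both servers and still has to be added by hand.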

14. Configure the secrets file

Let's put the following content in the /etc/ipsec.secrets file on both servers:

192.168.58.6 192.168.58.7: PSK "password"

Note that this file contains the “public” IPs and the pre-shared connection key, “password”.
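
The key “password” is fine for a lab, but a random key is a safer habit. The sketch below builds the same kind of line with a random pre-shared key read from /dev/urandom. secrets_line and random_psk are hypothetical helper names, and the 32-character length is an arbitrary choice of mine.

```shell
#!/bin/bash
# secrets_line is a hypothetical helper: it prints one ipsec.secrets entry
# in the same format used above (two IPs, a colon, PSK and the quoted key).
secrets_line() {
    printf '%s %s: PSK "%s"\n' "$1" "$2" "$3"
}

# Random 32-character key made of letters and digits.
random_psk() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32
}

secrets_line 192.168.58.6 192.168.58.7 "$(random_psk)"
```

Generate the key once and copy the same line to both servers, since both sides must share the same secret.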

15. Start the IPSec service

systemctl enable ipsec.service
systemctl restart ipsec.service
systemctl status ipsec.service

Screen Shot 2020-05-31 at 10.23.50 PM

If you get a message saying that the IPSec service was not started, run the following commands and then run the ones above again.

ipsec initnss --nssdir /etc/ipsec.d
ipsec newhostkey --output /etc/ipsec.secrets --bits 2192 --verbose --hostname $HOSTNAME

16. Check that the tunnel is up

ipsec status | grep established

Screen Shot 2020-05-31 at 10.24.48 PM

17. Configure the routes on Server1 and Server2

On Server1:

ip route add 192.168.57.0/24 via 192.168.56.106

Screen Shot 2020-05-31 at 10.26.20 PM

On Server2:

ip route add 192.168.56.0/24 via 192.168.57.10

Screen Shot 2020-05-31 at 10.26.29 PM

To make the routes persistent, you can copy the route into the /etc/sysconfig/network-scripts/route-enp0s3 file (adjusting the network adapter name).
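
As a sketch of that last step: on CentOS 7 the route-<interface> file accepts lines written like the arguments of ip route add. route_line is a hypothetical helper that just prints such a line; the interface name enp0s3 is the one from my lab and may differ in yours.

```shell
#!/bin/bash
# route_line is a hypothetical helper: destination, gateway, interface.
route_line() {
    echo "$1 via $2 dev $3"
}

# Server1 (run as root to append to the file):
#   route_line 192.168.57.0/24 192.168.56.106 enp0s3 >> /etc/sysconfig/network-scripts/route-enp0s3
# Server2:
#   route_line 192.168.56.0/24 192.168.57.10 enp0s3 >> /etc/sysconfig/network-scripts/route-enp0s3

# Show the line for Server1 without writing anything:
route_line 192.168.57.0/24 192.168.56.106 enp0s3
```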

18. Test the ping between Server1 and Server2

Now the ping should work, because the two sites are connected through the IPSec tunnel!

Let's run the final ping test and, at the same time, monitor the packets arriving at the Openswan1 and Openswan2 servers. To do that, open SSH sessions to both servers and run:

yum install tcpdump
tcpdump

Leave the command running on the Openswan1 and Openswan2 VMs, and let's ping Server2 from Server1:

Screen Shot 2020-05-31 at 10.30.57 PMScreen Shot 2020-05-31 at 10.31.08 PMScreen Shot 2020-05-31 at 10.31.19 PM

Note that the ping worked as expected, and in the tcpdump output we can see the ICMP request and reply packets on both servers, which confirms that the communication is going through the IPSec tunnel we created.

Also run the opposite test, pinging Server1 from Server2.

19. Test an SSH connection between Server1 and Server2

To wrap up, let's make a real connection between the servers, via SSH.

This time I will start the connection from Server2, trying to SSH into Server1:

Screen Shot 2020-05-31 at 10.38.02 PM

And that's it, folks. I hope you enjoyed it. The post is long, but for me the experience was worth it!

In my next post I will show how to connect a VPC in the AWS cloud, using an instance with Openswan the same way I did here to simulate the on-premises environment, to the IPSec VPN service of Oracle's OCI cloud.

See you then.

Autoscaling in OCI

 

Português/English

In this video I show the complete process of Autoscaling in OCI, which allows you to scale out or scale in the instances in an Instance Pool according to pre-configured metrics. I go over all the steps, from the creation of the VCN, Subnets, Route Tables and Security Lists, through Instances and Load Balancers, to finally creating the Instance Configuration, Instance Pool and Autoscaling configuration and putting it all to work together.

Note: here is the script I used to automatically create the Web Server:

#!/bin/bash -x
#yum update -y
yum install -y httpd
yum install -y stress-ng
systemctl enable httpd.service
systemctl restart httpd.service
echo '<font size="+6">' > /var/www/html/index.html
hostname >> /var/www/html/index.html
date >> /var/www/html/index.html
echo '</font>' >> /var/www/html/index.html
firewall-offline-cmd --add-service=http
systemctl enable firewalld
systemctl restart firewalld

The video is in Portuguese, to help the Oracle community in Brazil, ok?

Feedback and suggestions for new videos are always welcome.

I hope you like it.

Autoscaling in OCI

Português/English

In this video I demonstrate the whole Autoscaling process in OCI, in which you can grow or shrink the number of instances in the Instance Pool according to pre-configured metrics. I go through all the steps, from the creation of the VCN, Subnets, Route Tables and Security Lists, through the creation of Instances and Load Balancers, to finally creating the Instance Configuration, Instance Pool and Autoscaling and putting everything to work. Note: here is the script I used to automatically create the Web Server:

#!/bin/bash -x
#yum update -y
yum install -y httpd
yum install -y stress-ng
systemctl enable httpd.service
systemctl restart httpd.service
echo '<font size="+6">' > /var/www/html/index.html
hostname >> /var/www/html/index.html
date >> /var/www/html/index.html
echo '</font>' >> /var/www/html/index.html
firewall-offline-cmd --add-service=http
systemctl enable firewalld
systemctl restart firewalld

The video is in Portuguese, to help the Oracle community in Brazil, ok?

Feedback and suggestions for new videos are always welcome.

I hope you like it.

Moving Compartments in OCI

 

Português/English

In this video I explain how to move a compartment in OCI and, more importantly, what happens to the policies that refer to the moved compartment: when a policy is automatically adjusted to reflect the change, and when you need to create a new policy because the original one is not adjusted.

The video is in Portuguese, to help the Oracle community in Brazil, ok?

Feedback and suggestions for new videos are always welcome.

I hope you like it.

Moving Compartments in OCI

Português/English

In this video I explain how to move a compartment in OCI and, more importantly, what happens to the policies that refer to the moved compartment: when a policy is updated automatically to reflect the compartment's new location, and when you need to create a new policy because the original policy is not updated.

The video is in Portuguese, to help the Oracle community in Brazil, ok?

Feedback and suggestions for new videos are always welcome.

I hope you like it.

Route Tables, Security Lists and Network Security Groups in OCI

 

Português/English

In this second video I am posting about OCI, I explain the differences among Route Tables, Security Lists and Network Security Groups and show how to use them to allow access to the servers inside your VCN.

Once more, the video is in Portuguese, as it is mainly intended to help the Oracle community in Brazil.

EDIT: I’ve added English subtitles to the video, so if you don’t speak Portuguese, you can follow the video now ;-).

Your feedback and suggestions for other topics are welcome.

Here is the link.

I hope you enjoy it.

Route Tables, Security Lists and Network Security Groups in OCI

Português/English

In this second video I am publishing about OCI, I explain the difference between Route Tables, Security Lists and Network Security Groups and show how to use them to open the required access to the servers inside your VCN.

Once more, I recorded the video in Portuguese, to help the Oracle community in Brazil.

EDIT: I have added English subtitles to the video, so if you don't speak Portuguese you can now follow the content too ;-).

I look forward to your feedback and to suggestions of topics for other videos about Oracle database and/or OCI, so please let me know.

Here is the link.

I hope you like it.

How to configure Remote Peering in OCI

 

Português/English

In this post I am sharing a video I produced on How to configure Remote Peering in Oracle Cloud Infrastructure (OCI).

In case you don’t know, Remote Peering is required when you have two VCNs (OCI virtual networks) in different regions and want the resources (usually compute instances) in these networks to communicate with each other.

As there are already plenty of similar resources in English, I recorded the video in Portuguese, to help the Oracle community in Brazil.

EDIT: I’ve added English subtitles to the video, so if you don’t speak Portuguese, you can follow the video now ;-).

I plan to post more videos like this, as I find the time and choose the topics. If you would like me to talk about a specific subject related to Oracle database and/or OCI, please let me know.

Here is the link.

I hope you enjoy it.

How to configure Remote Peering in OCI

Português/English

In this post I am sharing a video I produced on how to configure Remote Peering in Oracle Cloud Infrastructure (OCI).

In case you don't know, Remote Peering is required when you have two VCNs (OCI virtual networks) in different regions and want the resources (usually Compute Instances) in these two networks to communicate with each other.

As there are already many similar resources in English, I recorded the video in Portuguese, to help the Oracle community in Brazil.

EDIT: I have added English subtitles to the video, so if you don't speak Portuguese you can now follow the content too ;-).

I intend to create more videos like this one, as I find the time and choose the topics. If you would like me to talk about a specific topic related to Oracle database and/or OCI, please let me know.

Here is the link.

I hope you like it.

How to move an Oracle database to another host WITHOUT the need to open it with RESETLOGS

Português/English

The technique of moving an Oracle database to a new host is well known, as explained for example in this documentation link. Also well known is the technique of keeping the target updated with all the changes made in the source until the cutover window.

Basically, the steps are the following:

  1. Prepare the new host (install Oracle software, etc).
  2. Copy all Oracle configuration files (spfile, password file, listener.ora, tnsnames.ora, etc) and make the necessary adjustments.
  3. Perform a full backup of the source database.
  4. Copy the backup files to the target and restore the database from it.
  5. While you are waiting for the cutover window, perform periodic incremental backups in the source and update the target with them.
  6. In the cutover window, perform a last incremental backup of the source with the database mounted, then copy and apply it to the target.
  7. Open the database with RESETLOGS option.

The main problem with this approach is that you end the process by opening the target database with RESETLOGS, which makes it a new incarnation, almost as if it were a new database. In practice you need to start your backup history from scratch, and if something goes wrong in the target you don’t have the option to revert to the old database, unless you accept losing all the changes made in the target after the database was opened.

If you are using Oracle Enterprise Edition, you can configure Data Guard between both databases, so you can switch to the new server almost without any downtime, and also revert to the source at any moment without any data loss if something goes wrong.

But what if you don’t have an Oracle EE license, or for any other reason you don’t want to use Data Guard? Is there a way to open the database without the RESETLOGS option and still be able to revert to the source if necessary?

I found very little documentation about this, so this post is to explain why usually a RESETLOGS would be necessary, and how to overcome this constraint.

First of all, some information about my environment:

  • I am using two Oracle Linux virtual servers in AWS with Oracle 12.2 installed.
  • The servers are named ora12c-1 (the source) and ora12c-2 (the target). The Linux prompt identifies which server I am working on.
  • In the source server I have only the database orclcdb, which will be used here. In the target there is no database.
  • Both servers can communicate through the network, so I can easily transfer files between them.
  • All the environment variables are configured properly for the oracle OS user. So whenever I connect to a database it will always be the right one.
  • The database files, FRA, backups and everything else will be copied to the same paths in the target.

So let’s get started.

1. Prepare the source database

I just want to check if the database is in ARCHIVELOG mode so I can execute all the backups online:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 11:33:13 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name, log_mode from v$database;

NAME	  LOG_MODE
--------- ------------
ORCLCDB   ARCHIVELOG

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

2. Prepare the target server

Here I will just create the directories that will be used by the target database, the same way they exist in the source:

[oracle@ora12c-2 ~]$ mkdir -p $ORACLE_BASE/admin/orclcdb/adump
[oracle@ora12c-2 ~]$ mkdir -p $ORACLE_BASE/fast_recovery_area/orclcdb/ORCLCDB

3. Copy the SPFILE, password file and network configuration files to the target and adjust them

Let’s copy all the Oracle configuration files to the target:

[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/dbs/spfileorclcdb.ora oracle@ora12c-2:$ORACLE_HOME/dbs/
spfileorclcdb.ora 100% 3584 4.9MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/dbs/orapworclcdb oracle@ora12c-2:$ORACLE_HOME/dbs/
orapworclcdb 100% 3584 2.9MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/network/admin/*ora oracle@ora12c-2:$ORACLE_HOME/network/admin/
listener.ora 100% 336 264.6KB/s 00:00 
sqlnet.ora 100% 202 364.2KB/s 00:00 
tnsnames.ora 100% 494 528.6KB/s 00:00

After copying these files, you should review the content of the network files and make the necessary adjustments considering the host server is different.
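
If the copied files refer to the source server by name, a simple substitution may be enough. The sketch below assumes that is the case (if your files use IP addresses, you will need to edit those instead), and fix_host is a hypothetical helper name. It keeps a .bak copy of each file it changes.

```shell
#!/bin/bash
# fix_host is a hypothetical helper: replace the OLD host name with NEW
# in each FILE. Usage: fix_host OLD NEW FILE...
# Note: OLD is used as a sed pattern, so a dot in it matches any character.
fix_host() {
    local old="$1" new="$2" f
    shift 2
    for f in "$@"; do
        # -i.bak edits in place and keeps the original as FILE.bak
        sed -i.bak "s/$old/$new/g" "$f"
    done
}

# On the target, for example:
#   fix_host ora12c-1 ora12c-2 $ORACLE_HOME/network/admin/listener.ora $ORACLE_HOME/network/admin/tnsnames.ora
```

Compare each file with its .bak copy afterwards to confirm that only the host entries changed.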

4. Perform a full backup of the source database

The first backup of the source database needs to be a full backup. Actually, it must be an incremental Level 0 backup in order to allow the subsequent Level 1 to consider it.

For the method we are going to use, there is no need to back up the archived logs, because the target database will be updated only with incremental backups.

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 12:17:15 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> backup incremental level 0 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit

Recovery Manager complete.

I will take note of the controlfile autobackup to restore from it in the target: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp .

5. Copy the backup files to the target

As I will need to copy backup files from the source to the target several times, I will keep the FRA of both servers in sync by using rsync:

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
./
84EA3D138B311A47E053AB5C1FACC44E/
84EA3D138B311A47E053AB5C1FACC44E/backupset/
84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/
84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp
    164,388,864 100%   20.01MB/s    0:00:07 (xfr#1, to-chk=8/18)
84EA94D2E64025B8E053AB5C1FAC5F38/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp
    164,503,552 100%   18.03MB/s    0:00:08 (xfr#2, to-chk=5/18)
archivelog/
archivelog/2019_03_25/
autobackup/
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp
     18,825,216 100%   20.31MB/s    0:00:00 (xfr#3, to-chk=2/18)
backupset/
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp
    327,761,920 100%   18.43MB/s    0:00:16 (xfr#4, to-chk=0/18)
onlinelog/

sent 627,374,884 bytes  received 163 bytes  19,303,847.60 bytes/sec
total size is 675,479,552  speedup is 1.08

6. Unset OMF in the target

If you are using Oracle Managed Files (OMF), I recommend temporarily unsetting it in the target. If OMF is set, RMAN will create the restored data files with a different suffix. We would then need to execute SET NEWNAME statements for the control file to recognise these new data files. As the control file will be copied again in the cutover, these statements would have to be reissued, increasing the difficulty and the probability of errors.

[oracle@ora12c-2 ~]$ echo $ORACLE_SID
orclcdb
[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 12:45:37 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  419430400 bytes
Fixed Size		    8793496 bytes
Variable Size		  297796200 bytes
Database Buffers	  109051904 bytes
Redo Buffers		    3788800 bytes
SQL> alter system set db_create_file_dest='' scope=both;

System altered.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

7. Restore the control file and mount the database

Let’s restore the control file from the autobackup taken in the source, and then mount the database:

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 12:48:59 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCLCDB (not mounted)

RMAN> restore controlfile from '/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp';

Starting restore at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
output file name=/u01/app/oracle/oradata/orclcdb/control01.ctl
output file name=/u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl
Finished restore at 25-MAR-19

RMAN> alter database mount;

Statement processed
released channel: ORA_DISK_1

9. Restore the database

Now it’s time to restore the database, i.e., the data files. Please note that, as the control file restored was backed up after the data files backup sets were created, it has the information about these backup sets and there is no need to catalog anything.

RMAN> restore database;

Starting restore at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/orclcdb/system01.dbf
channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:55
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00009 to /u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
channel ORA_DISK_1: restoring datafile 00010 to /u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00011 to /u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00005 to /u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
Finished restore at 25-MAR-19

RMAN> exit

Recovery Manager complete.

At this point, we have the target database in mount state and updated with all the changes until the moment of the backup.

10. Keep updating the target database with incremental backups

As the source database is still being used and changed over time, it is necessary to apply all these changes to the target. We will do this with periodic incremental backups. They could be executed weekly, daily or even more often, depending on the volume of changes.
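
One cycle of this routine can be wrapped in a small script on the source server. This is only a sketch under my lab's assumptions (the same paths, key file and host names used earlier in this post); run and refresh_cycle are hypothetical helper names, and with DRY_RUN=1 the script only prints what it would execute.

```shell
#!/bin/bash
# Sketch of one refresh cycle, run as the oracle user on the source server.
# Paths, key file and host name are the ones from this post; adjust as needed.
FRA=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
TARGET=oracle@ora12c-2
KEY="$HOME/EC2us-east-1.pem"

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_cycle() {
    # 1. Level 1 incremental backup (RMAN reads the command from standard input).
    run sh -c 'echo "backup incremental level 1 as compressed backupset database;" | rman target /' || return 1
    # 2. Copy the new backup pieces to the same FRA path on the target.
    run rsync -avz -e "ssh -i $KEY" "$FRA" "$TARGET:$FRA"
}

# Print the commands without running them:
DRY_RUN=1 refresh_cycle
```

Scheduling (with cron, for example) and the step that applies the backups on the target are deliberately left out of this sketch.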

For this example, I will simulate two cycles of changes in the database, each followed by an incremental backup that contains those changes.

I will create a table TABLE1 to mimic changes in the source database.

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:06:35 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE1 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Then I create an incremental level 1 backup.

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:09:06 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=68 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp tag=TAG20190325T130915 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit


Recovery Manager complete.

Next, more changes arrive in the source database, in the form of a new table, TABLE2.

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:10:43 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE2 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

And a new incremental level 1 backup will contain all these new changes.

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:11:32 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=70 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp tag=TAG20190325T131140 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit


Recovery Manager complete.

These new incremental backups should be sent, cataloged and applied to the target database:

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
18,825,216 100% 99.57MB/s 0:00:00 (xfr#1, to-chk=5/22)
autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp
18,825,216 100% 50.01MB/s 0:00:00 (xfr#2, to-chk=4/22)
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
2,539,520 100% 5.28MB/s 0:00:00 (xfr#3, to-chk=1/22)
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
2,211,840 100% 3.89MB/s 0:00:00 (xfr#4, to-chk=0/22)

sent 2,354,479 bytes received 115 bytes 1,569,729.33 bytes/sec
total size is 717,881,344 speedup is 304.89
[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:18:45 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> catalog start with '/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/' noprompt;

using target database control file instead of recovery catalog
searching for all files that match the pattern /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/

List of Files Unknown to the Database
=====================================
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp

RMAN> recover database;

Starting recover at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=27 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp tag=TAG20190325T130915
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp tag=TAG20190325T131140
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

starting media recovery

unable to find archived log
archived log thread=1 sequence=2
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/25/2019 13:19:15
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 2 and starting SCN of 1448493

RMAN> exit

Recovery Manager complete.

Please note the RMAN error at the end of the RECOVER. It says that media recovery requested an unknown archived log, sequence 2. This happens because the latest changes in the source database have not been archived yet; they exist only in the online redo log files.

So, this error is expected and we can ignore it now. At the end of the process we will apply all the changes and no data will be lost. I could avoid the error by issuing a RECOVER with UNTIL SEQUENCE.
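To illustrate the UNTIL SEQUENCE syntax, the sketch below reads RMAN output and builds a RECOVER command from the thread and sequence reported by RMAN-06054. The helper name `recover_until_from_error` is hypothetical, and I did not run this variant in the test above, so treat it as an assumption to validate in your own environment. RMAN recovers up to, but not including, the sequence given.

```shell
# Hypothetical helper: reads RMAN output on stdin and, when it finds the
# RMAN-06054 message, prints a RECOVER command that stops just before the
# archived log sequence that RMAN could not find.
recover_until_from_error() {
  sed -n 's/^RMAN-06054: .* for thread \([0-9][0-9]*\) with sequence \([0-9][0-9]*\) .*$/recover database until sequence \2 thread \1;/p'
}

# Example (recover.log is an assumed file holding the output of a previous run):
#   recover_until_from_error < recover.log
```

For the error shown above, the generated command would be `recover database until sequence 2 thread 1;`.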

11. Start the cutover window: shut down the source database and perform the last incremental backup

OK, so your target database has been kept up to date for some days or weeks through incremental backups, and now it’s time to perform the final steps that make the new database available to the users, replacing the old one on the source.

First, let’s simulate the latest changes in the source database by creating a new table, TABLE3.

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:28:17 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE3 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Then the cutover starts: we shut down the source database, start it again in MOUNT state, and perform the last incremental backup:

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:29:55 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database closed
database dismounted
Oracle instance shut down

RMAN> startup mount

connected to target database (not started)
Oracle instance started
database mounted

Total System Global Area 419430400 bytes

Fixed Size 8793496 bytes
Variable Size 297796200 bytes
Database Buffers 109051904 bytes
Redo Buffers 3788800 bytes

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp tag=TAG20190325T133040 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003843810_g9l42thh_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit

Recovery Manager complete.

12. Copy the new backup files to the target

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003843810_g9l42thh_.bkp
18,825,216 100% 107.32MB/s 0:00:00 (xfr#1, to-chk=5/24)
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp
2,768,896 100% 9.60MB/s 0:00:00 (xfr#2, to-chk=0/24)

sent 1,580,107 bytes received 77 bytes 1,053,456.00 bytes/sec
total size is 739,475,456 speedup is 467.97
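Since the same rsync call is repeated in every cycle, it can be wrapped in a small function. This is a minimal sketch: the function name `fra_sync_command` and the variable names are my own, and the default values are simply the ones used in this post. It only prints the command, so you can check it before running anything.

```shell
# Values taken from the sessions in this post; override them as needed.
FRA_DIR="${FRA_DIR:-/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/}"
TARGET_HOST="${TARGET_HOST:-oracle@ora12c-2}"
SSH_KEY="${SSH_KEY:-$HOME/EC2us-east-1.pem}"

# Prints the rsync command that mirrors the FRA to the same path on the target.
# The trailing slash on FRA_DIR makes rsync copy the directory contents rather
# than the directory itself.
fra_sync_command() {
  printf 'rsync --progress -avz -e "ssh -i %s" %s %s:%s\n' \
    "$SSH_KEY" "$FRA_DIR" "$TARGET_HOST" "$FRA_DIR"
}

# Example: review the command, then run it.
#   fra_sync_command
#   eval "$(fra_sync_command)"
```

Adding `-n` (`--dry-run`) to the rsync options is another way to preview which files would be transferred.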

13. Copy the Redo Log Files and the up-to-date Control File to the target

Here is the trick that will allow us to open the database without using RESETLOGS: we will copy the online redo log files and the up-to-date control file to the target.

Let me explain why it is necessary.

First of all, the control file is where Oracle keeps track of the last change (SCN) made to a database. So, if you have the current control file, i.e., not one restored from a backup, then the last SCN is recorded in it and Oracle knows what the last change was.

On the other hand, if you restore a control file from a backup, the SCN information inside it is obsolete and cannot be used to determine the last change. If the control file does not hold the last SCN, it must be updated, and that happens when you open the database with the RESETLOGS option. Whether or not you were able to apply all the changes, if the control file came from a backup, your database must be opened with RESETLOGS.

Let’s take a look at the control file of the target database at this point:

[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:43:36 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select controlfile_type from v$database;

CONTROL
-------
BACKUP

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

As you can see, the target database has its control file marked as a backup. RMAN did it.

So, with this control file, the only way to open the target database is with RESETLOGS. But we want to avoid it, and the way to achieve this is by copying the last control file from the source to the target. This copy must be done manually, because any control file generated by RMAN will be marked as a backup.

We also need to copy the online redo log files, because they may hold recent changes that were not archived yet. Even if you are sure there are no unarchived changes, the redo logs are still necessary, because Oracle needs to check them to confirm that all the changes were applied and the database is up to date.

Before I copy the files, I will shut down the target database, because its control files will be replaced.

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:54:17 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database dismounted
Oracle instance shut down

RMAN> exit

Recovery Manager complete.

So, let’s copy the files. First, let me see where my control files and redo log files are on the source:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:56:01 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/orclcdb/control01.ctl
/u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl

SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/orclcdb/redo03.log
/u01/app/oracle/oradata/orclcdb/redo02.log
/u01/app/oracle/oradata/orclcdb/redo01.log

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Now, let’s copy the files to the target:

[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/control01.ctl oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
control01.ctl 100% 18MB 117.7MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/
control02.ctl 100% 18MB 117.8MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo01.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo01.log 100% 200MB 87.7MB/s 00:02 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo02.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo02.log 100% 200MB 67.1MB/s 00:02 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo03.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo03.log 100% 200MB 66.9MB/s 00:02
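With more control files or redo log members, typing one scp per file gets tedious. The sketch below builds the commands from a list of paths, such as the names returned by v$controlfile and v$logfile. The function name `copy_commands` is hypothetical, and the host and key defaults are the ones used in this post. It prints the commands instead of running them.

```shell
# Reads file paths on stdin (one per line) and prints one scp command per file,
# copying each file to the same directory on the target server.
# $1 = target user@host, $2 = SSH private key (both optional).
copy_commands() {
  host="${1:-oracle@ora12c-2}"
  key="${2:-$HOME/EC2us-east-1.pem}"
  while IFS= read -r path; do
    [ -n "$path" ] || continue   # skip empty lines
    printf 'scp -i %s %s %s:%s/\n' "$key" "$path" "$host" "$(dirname "$path")"
  done
}

# Example (files.txt is an assumed file holding the paths listed above):
#   copy_commands < files.txt
```

Remember that both databases must be shut down or, in the case of the source, not open for changes while these files are copied, as described in the steps above.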

14. Mount and recover the target database

Now we have the latest control files and redo log files, so this time the RECOVER command should complete successfully, applying all the changes and confirming that media recovery is complete.

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 14:01:54 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database (not started)

RMAN> startup mount

Oracle instance started
database mounted

Total System Global Area 419430400 bytes

Fixed Size 8793496 bytes
Variable Size 297796200 bytes
Database Buffers 109051904 bytes
Redo Buffers 3788800 bytes

RMAN> recover database;

Starting recover at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp tag=TAG20190325T133040
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:00

Finished recover at 25-MAR-19

Let’s confirm that our control file is no longer marked as a backup:

RMAN> select controlfile_type from v$database;

CONTROL
-------
CURRENT

Now the control file is recognised as current, which means the database can be opened without RESETLOGS, as long as all the data files are correctly recovered. Let’s check if there is any file needing recovery:

RMAN> select * from v$recover_file;

no rows selected

Nothing to be recovered.
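The two checks above can be combined into a single decision before opening the database. The following is a sketch under my own assumptions: the function name `open_mode` is hypothetical, and it expects you to pass in the values you obtained from the two queries (the CONTROLFILE_TYPE value and the number of rows in v$recover_file).

```shell
# Hypothetical decision helper.
#   $1 = CONTROLFILE_TYPE from v$database (for example CURRENT or BACKUP)
#   $2 = number of rows returned by v$recover_file
open_mode() {
  if [ "$1" = "CURRENT" ] && [ "$2" -eq 0 ]; then
    # Current control file and nothing left to recover: a plain open is possible.
    echo "alter database open;"
  elif [ "$1" != "CURRENT" ]; then
    echo "control file is $1: RESETLOGS would be required"
  else
    echo "$2 data file(s) still need recovery"
  fi
}

# Example with the values seen in this post:
#   open_mode CURRENT 0
```

With the values from this post (CURRENT and zero rows), the helper prints the plain `alter database open;` statement used in the next step.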

15. Open the target database

Everything is OK, so I can open the database in the target server without using RESETLOGS:

RMAN> alter database open;

Statement processed

RMAN> exit

Recovery Manager complete.

Just to confirm that we didn’t lose any information, let’s check if all the tables we created in the source to mimic data changes over time are available in the target:

[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 14:26:28 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select count(*) from table1;

COUNT(*)
----------
72633

SQL> select count(*) from table2;

COUNT(*)
----------
72634

SQL> select count(*) from table3;

COUNT(*)
----------
72635

Great, everything is OK.

16. Rollback to the source server

As long as the source database has not been opened after the cutover steps, you can perform the reverse operation to roll back to the old server. To do so, repeat all the steps beginning with the one named “Start the cutover window”, this time swapping the roles of the source and target databases.

That’s it. I hope you enjoy!

How to move an Oracle database to another server WITHOUT having to open it with RESETLOGS

Português/English

The technique for moving an Oracle database to a new server is well known, as explained for example in this documentation link. The technique for keeping the target database up to date until the moment of the switch is also well known.

Basically, the steps are the following:

  1. Prepare the new server (install the Oracle software, etc).
  2. Copy all the Oracle configuration files (spfile, password file, listener.ora, tnsnames.ora, etc) and make the necessary adjustments.
  3. Run a full backup of the source database.
  4. Copy the backup files to the target and restore the database from them.
  5. While waiting for the cutover, run periodic incremental backups on the source and apply them to the target database.
  6. At cutover time, take one last incremental backup of the source database in MOUNT state, then copy it to the target and apply it.
  7. Open the target database with the RESETLOGS option.

The main problem with this process is that the database is opened with RESETLOGS, which makes it a new incarnation, almost like a new database. In practice, you have to start your backup history from scratch, and if something goes wrong on the target server it is impossible to go back to the source database, unless you accept losing all the changes made on the target after the database was opened.

If you are using Oracle Enterprise Edition, you can configure DataGuard between the databases, so you can switch over to the new server with almost no downtime, and also revert to the source without any data loss if something goes wrong.

But what if you don’t have an Oracle EE license, or for whatever reason don’t want to configure DataGuard? Is there a way to open the target database without RESETLOGS, and also to revert to the source if necessary?

I found very little documentation about this, so this post explains why a RESETLOGS would normally be necessary in this situation, and how to overcome this limitation.

First of all, some information about my environment:

  • I am using two Oracle Linux virtual servers on AWS with Oracle 12.2 installed.
  • The servers are named ora12c-1 (the source) and ora12c-2 (the target). The Linux prompt shows which server I am working on.
  • On the source I have only the orclcdb database, which will be used here. There is no database on the target.
  • Both servers can communicate over the network, so I can easily transfer files between them.
  • All environment variables are properly set for the oracle user, so whenever I connect to a database it will be the correct one.
  • The database files, FRA, backups and everything else will be copied to the same paths on the target.

So let’s get started.

1. Prepare the source database

I just want to check that the database is in ARCHIVELOG mode, so that I can run all the backups online:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 11:33:13 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name, log_mode from v$database;

NAME	  LOG_MODE
--------- ------------
ORCLCDB   ARCHIVELOG

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

2. Prepare the target server

Here I will just create the directories that will be used by the database, with the same paths that exist on the source:

[oracle@ora12c-2 ~]$ mkdir -p $ORACLE_BASE/admin/orclcdb/adump
[oracle@ora12c-2 ~]$ mkdir -p $ORACLE_BASE/fast_recovery_area/orclcdb/ORCLCDB

3. Copy the SPFILE, password file and network configuration files to the target and adjust them

Let’s copy the configuration files to the target:

[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/dbs/spfileorclcdb.ora oracle@ora12c-2:$ORACLE_HOME/dbs/
spfileorclcdb.ora 100% 3584 4.9MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/dbs/orapworclcdb oracle@ora12c-2:$ORACLE_HOME/dbs/
orapworclcdb 100% 3584 2.9MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem $ORACLE_HOME/network/admin/*ora oracle@ora12c-2:$ORACLE_HOME/network/admin/
listener.ora 100% 336 264.6KB/s 00:00 
sqlnet.ora 100% 202 364.2KB/s 00:00 
tnsnames.ora 100% 494 528.6KB/s 00:00

After copying these files, you should review the contents of the network files and make the necessary adjustments, since the server is different.
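One typical adjustment is replacing the source host name with the target host name. The sketch below assumes that your listener.ora and tnsnames.ora refer to the server by name; if they use IP addresses or other settings that differ between servers, they need a manual review. The function name `retarget_host` is hypothetical.

```shell
# Reads a network configuration file on stdin and writes it to stdout with
# every occurrence of the source host name replaced by the target host name.
# $1 = source host name, $2 = target host name.
retarget_host() {
  sed "s/$1/$2/g"
}

# Example (writes to a new file so the copied original can be compared):
#   retarget_host ora12c-1 ora12c-2 < listener.ora > listener.ora.new
```

Writing to a separate file first makes it easy to diff the result against the original before replacing it.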

4. Run a full backup on the source

The first backup on the source needs to be a full one. More precisely, it must be an incremental level 0 backup, so that the subsequent level 1 backups can use it as their base.

For the method we are going to use, there is no need to back up the archived logs, because the target database will be updated only with incremental backups.

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 12:17:15 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> backup incremental level 0 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp tag=TAG20190325T121727 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit

Recovery Manager complete.

I will take note of the control file autobackup that was generated, so I can restore it on the target: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp .
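If you save the RMAN output to a file, the autobackup piece can be extracted instead of copied by hand. This is a minimal sketch that relies on the message format shown in the output above; the function name `autobackup_piece` and the log file name are assumptions.

```shell
# Reads RMAN output on stdin and prints the piece handle reported inside the
# "Control File and SPFILE Autobackup" section, ignoring data file pieces.
autobackup_piece() {
  sed -n '/^Starting Control File and SPFILE Autobackup/,/^Finished Control File and SPFILE Autobackup/s/^piece handle=\([^ ]*\) .*$/\1/p'
}

# Example (backup.log is an assumed file holding the RMAN session output):
#   autobackup_piece < backup.log
```

Limiting the match to the autobackup section avoids picking up the piece handles of the data file backup sets, which use the same line format.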

5. Copy the backup files to the target

Since I will have to copy backup files to the target several times, I will keep the FRA synchronized between the two servers with rsync:

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
./
84EA3D138B311A47E053AB5C1FACC44E/
84EA3D138B311A47E053AB5C1FACC44E/backupset/
84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/
84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp
    164,388,864 100%   20.01MB/s    0:00:07 (xfr#1, to-chk=8/18)
84EA94D2E64025B8E053AB5C1FAC5F38/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/
84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp
    164,503,552 100%   18.03MB/s    0:00:08 (xfr#2, to-chk=5/18)
archivelog/
archivelog/2019_03_25/
autobackup/
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp
     18,825,216 100%   20.31MB/s    0:00:00 (xfr#3, to-chk=2/18)
backupset/
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp
    327,761,920 100%   18.43MB/s    0:00:16 (xfr#4, to-chk=0/18)
onlinelog/

sent 627,374,884 bytes  received 163 bytes  19,303,847.60 bytes/sec
total size is 675,479,552  speedup is 1.08

6. Turn off OMF at the destination

If you are using Oracle Managed Files (OMF), I recommend turning it off temporarily at the destination. With OMF configured, the data files created by an RMAN restore get a different suffix in their names, so we would have to run SET NEWNAME commands for the control file to recognize the new files. And since the control file will be copied again during the cutover, those commands would have to be run again, adding complexity and room for error.

[oracle@ora12c-2 ~]$ echo $ORACLE_SID
orclcdb
[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 12:45:37 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  419430400 bytes
Fixed Size		    8793496 bytes
Variable Size		  297796200 bytes
Database Buffers	  109051904 bytes
Redo Buffers		    3788800 bytes
SQL> alter system set db_create_file_dest='' scope=both;

System altered.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

7. Restore the control file and mount the database

Let's restore the control file from the autobackup and mount the database:

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 12:48:59 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCLCDB (not mounted)

RMAN> restore controlfile from '/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003825815_g9kzvqgf_.bkp';

Starting restore at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
output file name=/u01/app/oracle/oradata/orclcdb/control01.ctl
output file name=/u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl
Finished restore at 25-MAR-19

RMAN> alter database mount;

Statement processed
released channel: ORA_DISK_1

9. Restore the database

Now it is time to restore the database, that is, the data files. Note that, because the control file was restored from a backup taken after the data file backup, it already knows about those backups, so nothing needs to be cataloged.

RMAN> restore database;

Starting restore at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/orclcdb/system01.dbf
channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzrqyb_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:55
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00009 to /u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
channel ORA_DISK_1: restoring datafile 00010 to /u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00011 to /u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA94D2E64025B8E053AB5C1FAC5F38/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzt54y_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00005 to /u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/84EA3D138B311A47E053AB5C1FACC44E/backupset/2019_03_25/o1_mf_nnnd0_TAG20190325T121727_g9kzty8l_.bkp tag=TAG20190325T121727
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
Finished restore at 25-MAR-19

RMAN> exit

Recovery Manager complete.

At this point, the destination database is in MOUNT state and current up to the time of the last backup.

10. Keep updating the destination with incremental backups

Since the source database is open and changing over time, all those changes have to be applied at the destination. We will do that with periodic incremental backups. They can run weekly, daily, or even more often, depending on the volume of changes.

As an example, I will simulate two cycles of changes in the database, each followed by an incremental backup containing those changes.

I will create the table TABLE1 to simulate changes in the source database:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:06:35 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE1 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Then I take a level 1 incremental backup:

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:09:06 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=68 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp tag=TAG20190325T130915 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit


Recovery Manager complete.

Again, more changes happen at the source, in the form of the new table TABLE2:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:10:43 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE2 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

And a new level 1 incremental backup will contain these changes:

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:11:32 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=70 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp tag=TAG20190325T131140 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit


Recovery Manager complete.

These new incremental backups need to be shipped, cataloged, and applied at the destination:

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
18,825,216 100% 99.57MB/s 0:00:00 (xfr#1, to-chk=5/22)
autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp
18,825,216 100% 50.01MB/s 0:00:00 (xfr#2, to-chk=4/22)
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
2,539,520 100% 5.28MB/s 0:00:00 (xfr#3, to-chk=1/22)
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
2,211,840 100% 3.89MB/s 0:00:00 (xfr#4, to-chk=0/22)

sent 2,354,479 bytes received 115 bytes 1,569,729.33 bytes/sec
total size is 717,881,344 speedup is 304.89
[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:18:45 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> catalog start with '/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/' noprompt;

using target database control file instead of recovery catalog
searching for all files that match the pattern /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/

List of Files Unknown to the Database
=====================================
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842582_g9l2tq67_.bkp
File Name: /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003842726_g9l2z7b0_.bkp

RMAN> recover database;

Starting recover at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=27 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T130915_g9l2sx2j_.bkp tag=TAG20190325T130915
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T131140_g9l2yddm_.bkp tag=TAG20190325T131140
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

starting media recovery

unable to find archived log
archived log thread=1 sequence=2
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/25/2019 13:19:15
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 2 and starting SCN of 1448493

RMAN> exit

Recovery Manager complete.

Look at the RMAN error at the end of the RECOVER command. It says that the archived log for sequence 2 is unknown. This happens because the latest changes at the source have not been archived yet; they exist only in the online redo log files.

So this error is expected and can be ignored. At the end of the process we will apply all changes and no data will be lost. The error could be avoided by running the RECOVER command with UNTIL SEQUENCE.
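
To illustrate the UNTIL SEQUENCE remark, here is a small sketch that reads the RMAN-06054 line from a saved recover log and builds a RECOVER command that stops just before the log RMAN could not find. UNTIL SEQUENCE n recovers up to, but not including, sequence n, so the missing sequence is exactly the number to use. The helper names parse_rman_06054 and recover_until_cmd and the file recover.log are my own inventions for this example; I did not run this as part of the migration.

```shell
# Hypothetical helper: read RMAN output on stdin and print "<thread> <sequence>"
# taken from the RMAN-06054 message, if there is one.
parse_rman_06054() {
    sed -n 's/^RMAN-06054: .* for thread \([0-9][0-9]*\) with sequence \([0-9][0-9]*\) .*/\1 \2/p'
}

# Hypothetical helper: print a RECOVER command that stops before the
# given sequence. $1 = thread, $2 = sequence.
recover_until_cmd() {
    printf 'recover database until sequence %s thread %s;\n' "$2" "$1"
}

# Example, assuming the previous RMAN session was saved to recover.log:
# set -- $(parse_rman_06054 < recover.log)
# recover_until_cmd "$1" "$2" | rman target /
```

For the error shown above, this produces a recover up to sequence 2 of thread 1. On the very first cycle you would still have to find the sequence number some other way, because it only shows up in the error message.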

11. Start the cutover window: shut down the source database and take the last incremental backup

OK, so the destination database has been kept up to date for days or weeks through incremental backups, and now it is time for the final steps that make the new database available to users on the new server, replacing the old one.

First, let's simulate more changes in the source database by creating a new table, TABLE3.

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:28:17 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table TABLE3 as select * from dba_objects;

Table created.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Then the cutover begins by shutting down the source database, mounting it, and taking the last incremental backup:

[oracle@ora12c-1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:29:55 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database closed
database dismounted
Oracle instance shut down

RMAN> startup mount

connected to target database (not started)
Oracle instance started
database mounted

Total System Global Area 419430400 bytes

Fixed Size 8793496 bytes
Variable Size 297796200 bytes
Database Buffers 109051904 bytes
Redo Buffers 3788800 bytes

RMAN> backup incremental level 1 as compressed backupset database;

Starting backup at 25-MAR-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orclcdb/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orclcdb/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orclcdb/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 25-MAR-19
channel ORA_DISK_1: finished piece 1 at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp tag=TAG20190325T133040 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=/u01/app/oracle/oradata/orclcdb/pdb1/sysaux01.dbf
skipping datafile 00010 because it has not changed
input datafile file number=00009 name=/u01/app/oracle/oradata/orclcdb/pdb1/system01.dbf
skipping datafile 00009 because it has not changed
input datafile file number=00011 name=/u01/app/oracle/oradata/orclcdb/pdb1/users01.dbf
skipping datafile 00011 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/orclcdb/pdbseed/sysaux01.dbf
skipping datafile 00006 because it has not changed
input datafile file number=00005 name=/u01/app/oracle/oradata/orclcdb/pdbseed/system01.dbf
skipping datafile 00005 because it has not changed
channel ORA_DISK_1: backup cancelled because all files were skipped
Finished backup at 25-MAR-19

Starting Control File and SPFILE Autobackup at 25-MAR-19
piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/autobackup/2019_03_25/o1_mf_s_1003843810_g9l42thh_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-MAR-19

RMAN> exit

Recovery Manager complete.

12. Copy the new backup files to the destination

[oracle@ora12c-1 ~]$ rsync --progress -avz -e "ssh -i ~/EC2us-east-1.pem" /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/ oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/
sending incremental file list
autobackup/2019_03_25/
autobackup/2019_03_25/o1_mf_s_1003843810_g9l42thh_.bkp
18,825,216 100% 107.32MB/s 0:00:00 (xfr#1, to-chk=5/24)
backupset/2019_03_25/
backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp
2,768,896 100% 9.60MB/s 0:00:00 (xfr#2, to-chk=0/24)

sent 1,580,107 bytes received 77 bytes 1,053,456.00 bytes/sec
total size is 739,475,456 speedup is 467.97

13. Copy the redo log files and the final control file to the destination

Here is the “trick” that lets us open the database without RESETLOGS: we will copy the online redo log files and the most current control file to the destination.

Let me explain why this is necessary.

To begin with, the control file is the file Oracle uses to track the last change (SCN) made to a database. So if you have the current control file, that is, one not restored from a backup, the last SCN is recorded in it and Oracle knows the number of the last change.

On the other hand, if you restore a control file from a backup, the SCN inside it is stale and cannot be used to identify the last change. And if the control file does not have the last change, it needs to be updated, which is done by opening the database with RESETLOGS. It does not matter whether you managed to apply all the changes or not: if the control file was restored from a backup, the database must be opened with RESETLOGS.

Let's look at the destination control file at this point:

[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:43:36 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select controlfile_type from v$database;

CONTROL
-------
BACKUP

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

As you can see, the destination database has its control file marked as a backup. RMAN set this mark.

So, with this control file, the only way to open the database is with RESETLOGS. But we want to avoid that, and the way to do it is to copy the latest control file from the source to the destination. This copy has to be made manually, because every control file generated by RMAN will be marked as a backup.

We also need to copy the redo log files, because they contain the latest changes that have not been archived yet. Even if you are sure there are no unarchived changes, the redo logs are needed because Oracle has to check inside them to make sure all changes have been applied.

Before copying the files, I will shut down the destination database, because its control files will be replaced.

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 13:54:17 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCLCDB (DBID=2774882222, not open)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database dismounted
Oracle instance shut down

RMAN> exit

Recovery Manager complete.

Now let's copy the files. First, let me check which control files and redo log files exist at the source:

[oracle@ora12c-1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 13:56:01 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/orclcdb/control01.ctl
/u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl

SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/orclcdb/redo03.log
/u01/app/oracle/oradata/orclcdb/redo02.log
/u01/app/oracle/oradata/orclcdb/redo01.log

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Now let's copy the files to the destination:

[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/control01.ctl oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
control01.ctl 100% 18MB 117.7MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/fast_recovery_area/orclcdb/control02.ctl oracle@ora12c-2:/u01/app/oracle/fast_recovery_area/orclcdb/
control02.ctl 100% 18MB 117.8MB/s 00:00 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo01.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo01.log 100% 200MB 87.7MB/s 00:02 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo02.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo02.log 100% 200MB 67.1MB/s 00:02 
[oracle@ora12c-1 ~]$ scp -i ~/EC2us-east-1.pem /u01/app/oracle/oradata/orclcdb/redo03.log oracle@ora12c-2:/u01/app/oracle/oradata/orclcdb/
redo03.log 100% 200MB 66.9MB/s 00:02
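
With more than a handful of control files and redo log members, typing one scp per file gets error-prone. As a sketch only, the function below takes the paths returned by v$controlfile and v$logfile, one per line, and prints the matching scp commands so they can be reviewed before running. The name emit_scp_commands and the file files.txt are made up for this example, and it assumes, as in this post, that the destination uses the same directory layout as the source and that the paths contain no spaces.

```shell
# Hypothetical helper: print one scp command per file path read from stdin,
# copying each file to the same directory on the destination host.
# $1 = ssh private key, $2 = user@host of the destination.
emit_scp_commands() {
    while IFS= read -r file; do
        # Skip blank lines, for example those left over from SQL*Plus output.
        [ -n "$file" ] || continue
        printf 'scp -i %s %s %s:%s/\n' "$1" "$file" "$2" "$(dirname "$file")"
    done
}

# Example, assuming files.txt holds the paths listed by the two queries above:
# emit_scp_commands ~/EC2us-east-1.pem oracle@ora12c-2 < files.txt
```

Printing the commands instead of executing them is deliberate: both databases must be down at this point, and a quick look at the list is a cheap way to confirm no control file or redo member is missing.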

14. Mount and recover the database at the destination

Now we have the latest control file and the redo log files. This time the RECOVER command has to complete successfully, applying all the changes and confirming that recovery is complete:

[oracle@ora12c-2 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Mar 25 14:01:54 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database (not started)

RMAN> startup mount

Oracle instance started
database mounted

Total System Global Area 419430400 bytes

Fixed Size 8793496 bytes
Variable Size 297796200 bytes
Database Buffers 109051904 bytes
Redo Buffers 3788800 bytes

RMAN> recover database;

Starting recover at 25-MAR-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/app/oracle/oradata/orclcdb/system01.dbf
destination for restore of datafile 00003: /u01/app/oracle/oradata/orclcdb/sysaux01.dbf
destination for restore of datafile 00004: /u01/app/oracle/oradata/orclcdb/undotbs01.dbf
destination for restore of datafile 00007: /u01/app/oracle/oradata/orclcdb/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/orclcdb/ORCLCDB/backupset/2019_03_25/o1_mf_nnnd1_TAG20190325T133040_g9l420t6_.bkp tag=TAG20190325T133040
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:00

Finished recover at 25-MAR-19

Let's confirm that our control file is no longer marked as a backup:

RMAN> select controlfile_type from v$database;

CONTROL
-------
CURRENT

Now the control file is recognized as current, which means the database can be opened WITHOUT RESETLOGS, as long as all data files are properly recovered. Let's check whether any file needs recovery:

RMAN> select * from v$recover_file;

no rows selected

Nothing to recover.
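
If the cutover is scripted, these two checks make a natural guard before the open. The sketch below only encodes the rule described in this post: the control file type must be CURRENT and v$recover_file must return no rows. The function name check_open_without_resetlogs is hypothetical, and fetching the two values (for example with a silent SQL*Plus session) is left out.

```shell
# Hypothetical guard: decide whether the database may be opened without RESETLOGS.
# $1 = CONTROLFILE_TYPE from v$database
# $2 = number of rows returned by v$recover_file
check_open_without_resetlogs() {
    if [ "$1" != "CURRENT" ]; then
        echo "blocked: control file type is $1"
        return 1
    fi
    if [ "$2" -ne 0 ]; then
        echo "blocked: $2 data file(s) still need recovery"
        return 1
    fi
    echo "ok"
}

# Example with the values seen in this post:
# check_open_without_resetlogs CURRENT 0 && echo 'alter database open;' | rman target /
```

A non-zero return code stops the pipeline in the example, so the open is never attempted with a backup control file or with files still pending recovery.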

15. Open the database at the destination

Everything is OK, so I can open the database on the destination server WITHOUT USING RESETLOGS:

RMAN> alter database open;

Statement processed

RMAN> exit

Recovery Manager complete.

Just to confirm that we lost no data, let's check that all the tables created at the source to simulate changes exist at the destination:

[oracle@ora12c-2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Mon Mar 25 14:26:28 2019

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select count(*) from table1;

COUNT(*)
----------
72633

SQL> select count(*) from table2;

COUNT(*)
----------
72634

SQL> select count(*) from table3;

COUNT(*)
----------
72635

Great, everything is OK.

16. Rollback to the source

As long as the source database has not been opened after the transfer steps, you can do the reverse operation and go back to using the source database. To do that, just repeat all the steps starting from step 11 (the start of the cutover window), this time swapping the roles of the source and destination databases.

That's it. I hope you enjoy it!