Tuesday, October 19, 2010

Panther: oninit -i ... oops... too late... or not

This article was originally published in English and Portuguese.

English version:

I hope this one will be quick... How many of us have tried, by mistake, to initialize (oninit -i) an instance that was already initialized? Personally I don't think I ever did, but our mind tends to erase bad experiences :) Still, we have all heard too many stories like this. One mistake in the environment setup and it can easily happen.
Well, the good folks from R&D tried to keep us safe from ourselves by introducing a new parameter called FULL_DISK_INIT. It's something that magically appears in the $ONCONFIG file with the value 0, or that simply is not there... Its absence, or the value 0, means that if you run oninit -i and there is already an Informix page in the rootdbs chunk, the initialization will fail. Let's see an example:


panther@pacman.onlinedomus.net:fnunes-> onstat -V
IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
Wed Oct 20 00:02:10 2010

00:02:10 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:02:10 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:19 DR: DRAUTO is 0 (Off)
00:02:19 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:02:19 Event notification facility epoll enabled.
00:02:19 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:02:20 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:02:20 oninit: Fatal error in shared memory initialization

00:02:20 IBM Informix Dynamic Server Stopped.

00:02:20 mt_shm_remove: WARNING: may not have removed all/correct segments

Very nice. It didn't let me shoot myself in the foot.
And what if the parameter is not in the $ONCONFIG at all?


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
#FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
The default memory page size will be used.
00:06:43 Segment locked: addr=0x44000000, size=224858112

Wed Oct 20 00:06:44 2010

00:06:44 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:06:44 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:50 DR: DRAUTO is 0 (Off)
00:06:50 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:06:50 Event notification facility epoll enabled.
00:06:50 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:06:52 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:06:52 oninit: Fatal error in shared memory initialization

panther@pacman.onlinedomus.net:fnunes->

The same result. So if I'm trying to configure a second instance and I point ROOTPATH at an existing one, I'm safe... But this raises a question: how can I really re-initialize an instance? I know what I'm doing, so let me work!... It's simple... If you really know what you're doing, set the parameter to 1:


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 1
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y
panther@pacman.onlinedomus.net:fnunes-> onstat -

IBM Informix Dynamic Server Version 11.70.UC1 -- On-Line -- Up 00:00:32 -- 369588 Kbytes

panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes->

Perfect! It allowed me to initialize it, and immediately changed the FULL_DISK_INIT parameter to 0 to keep me safe again.
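
As a rough pre-flight check before running oninit -i, the effective setting can be read straight from the configuration file. The sketch below is not part of Informix: the function name effective_full_disk_init is made up for illustration, and it relies on what the sessions above show, namely that a missing or commented-out FULL_DISK_INIT behaves like 0. It also assumes that if the parameter appears more than once the last uncommented line counts, which is not verified here.

```shell
# Print the effective FULL_DISK_INIT value found in an ONCONFIG-style file.
# Hypothetical helper, not an Informix utility.
# Assumption: a missing or commented-out parameter behaves like 0.
effective_full_disk_init() {
    awk '
        # Only uncommented lines whose first field is the parameter name match.
        $1 == "FULL_DISK_INIT" { value = $2 }
        # Fall back to 0 when no uncommented setting was seen.
        END { print (value == "" ? 0 : value) }
    ' "$1"
}
```

Calling effective_full_disk_init "$INFORMIXDIR/etc/$ONCONFIG" before an initialization prints 1 only when the safety check has been deliberately switched off.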
This has been on the feature request list for years. Now that it's implemented we should be jumping up and down with joy... But I'm not. Why? Because instead of sending R&D the compliments they deserve for implementing this, I want more!
This is terribly useful, and it will save a lot of people from destroying their instances. But unfortunately I've seen many other cases of destruction that this cannot prevent. A few examples:

  1. A chunk allocation for a second instance on the same machine (using RAW devices) overwrites a chunk already in use by another instance
  2. A restore of an instance overwrites another instance (either fully or partially)
  3. A restore of an instance on the same machine using the rename chunks functionality uses an outdated rename chunks file (the ontape -rename -f FILE option). The file is missing a few chunks that were added recently, so those chunks are restored over the existing ones!
So what would make me jump is something that covered all these scenarios. It would not be a simple ONCONFIG parameter plus a change in oninit. It would require changes in more utilities and server components (onspaces, SQL Admin API, ontape, onbar...), but that would really keep us safe from our mistakes. For now this is a good sign. If these issues worry you, stay alert, and if you have the chance let IBM know that this is important to you.
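
Until something like that exists, the third scenario can at least be checked by hand before a restore. The sketch below is only an illustration under stated assumptions: the function name chunks_missing_from_rename_file is invented, the first input file is a hypothetical list you prepare yourself with one "path offset" pair per line for every chunk of the instance, and the rename file is assumed to hold one "old_path old_offset new_path new_offset" entry per line. Check the ontape documentation for your version before relying on that layout.

```shell
# Print every chunk (path and offset) that has no entry in the rename file.
# $1: hypothetical chunk list, one "path offset" per line
# $2: rename file, assumed layout "old_path old_offset new_path new_offset"
chunks_missing_from_rename_file() {
    awk '
        # First argument to awk is the rename file: remember each old path/offset pair.
        FILENAME == ARGV[1] { if (NF >= 2) renamed[$1, $2] = 1; next }
        # Second argument is the chunk list: report pairs the rename file does not cover.
        NF >= 2 && !(($1, $2) in renamed) { print $1, $2 }
    ' "$2" "$1"
}
```

Any line it prints is a chunk that would be restored to its original location, so an empty output is what you want to see before starting the restore.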

One instance was destroyed to bring this article to you... I'll spend another 30s to get the data back into it :)



Portuguese version:

I hope this one will be quick... How many of us have tried, by mistake, to initialize (oninit -i/iy) an instance that was already initialized? Personally I don't recall it ever happening to me, but our mind tends to erase traumatic episodes :) Still, we have all heard too many stories like this. One problem in the environment setup is enough for it to happen.
Well, the good folks in development tried to keep us safe from ourselves by introducing a new parameter called FULL_DISK_INIT. It is something that magically appears in our $ONCONFIG with the value 0, or that simply is not there... Its absence, or the value 0, means that if we try to run oninit -i and an Informix page already exists in our rootdbs chunk, it will fail. Let's look at an example:


panther@pacman.onlinedomus.net:fnunes-> onstat -V
IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
Wed Oct 20 00:02:10 2010

00:02:10 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:02:10 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:19 DR: DRAUTO is 0 (Off)
00:02:19 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:02:19 Event notification facility epoll enabled.
00:02:19 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:02:20 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:02:20 oninit: Fatal error in shared memory initialization

00:02:20 IBM Informix Dynamic Server Stopped.

00:02:20 mt_shm_remove: WARNING: may not have removed all/correct segments


Very good. It did not let me shoot myself in the foot.
And what if the parameter is not in the $ONCONFIG?

panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
#FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
The default memory page size will be used.
00:06:43 Segment locked: addr=0x44000000, size=224858112

Wed Oct 20 00:06:44 2010

00:06:44 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:06:44 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:50 DR: DRAUTO is 0 (Off)
00:06:50 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:06:50 Event notification facility epoll enabled.
00:06:50 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:06:52 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:06:52 oninit: Fatal error in shared memory initialization

panther@pacman.onlinedomus.net:fnunes->

The same thing happens. So if I am trying to configure a new instance and mistakenly point ROOTPATH at an existing one, I am safe... But this raises a question: how can I re-initialize an instance? I know what I'm doing, so let me work!... It's simple... If you really know what you are doing, you just have to set it to 1:


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 1
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y
panther@pacman.onlinedomus.net:fnunes-> onstat -

IBM Informix Dynamic Server Version 11.70.UC1 -- On-Line -- Up 00:00:32 -- 369588 Kbytes

panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes->

Perfect. It let me initialize and immediately changed the FULL_DISK_INIT parameter back to 0 to protect me again.

This had been on the list of requested features for years. Now that it is implemented we should be jumping for joy... But I am not. Why? Because instead of sending development the compliments they deserve, I want more!
This is tremendously useful, and it will save many people from destroying their instances. But unfortunately I have seen many other cases of destruction that this cannot prevent. Some examples:
  1. Creating a chunk for a second instance on the same machine (using RAW devices) overwrites another chunk already in use by another instance
  2. Restoring a backup overwrites another instance (completely or partially)
  3. Restoring a backup of an instance on the same machine, using the chunk path renaming functionality, uses an outdated rename file (the ontape -rename -f FILE option). The file does not contain some chunks that were added recently, so those chunks are restored over the existing ones!
So what would make me jump for joy is something that covered all these scenarios. It would not be as simple as one change in the ONCONFIG and one change in oninit. It would require changes in more utilities and server components (onspaces, SQL Admin API, ontape, onbar...), but that really would keep us safe from our mistakes. For now this feature is a good sign. If these issues worry you, stay alert, and if you have the chance let IBM know that this is important to you.

One instance was destroyed to bring this article to you. Now I'll spend another 30s putting its data back :)

2 comments:

Unknown said...

Hi Fernando, greetings from Paraguay (I always read your blog, I think ever since IDS 11.10 came out).

Excellent news. I know of some customers who have had their own Friday the 13th for lack of this parameter in the Informix 9 or 10 configuration, so this is good and deserves applause.

Thank you, and sorry for my very bad Portuguese, hahaha

Javier Gray.

Unknown said...

Hi Fernando, it's me again. I am writing this way (I found no other means) to ask permission to translate articles from your blog into Spanish and publish them on a blog I opened a few days ago. I also invite you to visit it and leave your critique.

Thank you for your consideration,

Thanks,

Xavi.