Introduction
One of the most terrifying thing for a Database administrator to come across is a corrupted database.
A database corruption can be defined as a problem associated with the improper storage of the 0’s and 1’s that you need to store on the disk in order to access your data.
Despite the usefulness of the relational databases, they are prone to corruption, which results in the inaccessibility of some or all the data in the database.
More than 95% of corruption happens due to hardware failure. Among the remaining 5% we have,
- Bugs in software itself
- Abrupt system shutdown while the database is opened for writing or reading.
- Changes in SQL account
- Virus infection
- Upgrading software also at times, results in the corruption of the database
This blog addresses a similar issue associated with corrupted databases and how we identified and solved the problem.
Problem understanding
Our organization has an on-premise GitLab server. We have configured GitLab to take daily backups, but due to database corruption the backups started failing with the following error.
The plan was to move the GitLab installation to a new server but since we could not take a backup of our existing installation, we were stuck.
Next Steps
In case of database corruption, usually the loss is limited to the last action of one user, i.e. a single change to data. When a user starts to change data and the change is interrupted — for example, because of network service loss or any other reason, then the database file is marked as corrupted. The file can be repaired, but sometimes we might lose data after repair.
We need to be extremely careful with the live instance and the live data. Experimentation should be done on a virtual machine before performing any actions on the live instance.
The following steps were taken by us,
- Virtual environment setup
- Identify corrupted record(s)
- Database backup without corrupted record(s)
- Import database on virtual machine
- Fix database relationship and constraints
- Export database from virtual environment
- Import database on live instance
- Upgrade GitLab
- Finally run tests
Virtual environment setup
Generally, having a virtual environment gives us the freedom to experiment. Its a good practice to run trials on a virtual machine until you are satisfied with the results. This gives us the confidence that things will work as expected on live instance too.
Now lets create a virtual environment of the live instance. Only the applications which are directly or indirectly related to your database should be installed.
- Install Vagrant
- Install Debian Stretch
- Install PostgreSQL
- Install phpPgAdmin (Optional) – Web-based administration tool for PostgreSQL
GitLab uses PostgreSQL database.
Install Vagrant
Go to Vagrant’s official site and download the deb file for Debian.
You can find the link here.
Install Debian stretch64
Document on how to install Debian Stretch can be found here.
Install PostgreSQL
Install phpPgAdmin
This step is optional, use any of your favorite PostgreSQL GUI tools.
Identify corrupted record(s)
Login into the live instance.
Become a
gitlab-psql
user:
Connect to the PostgreSQL CLI:
gitlabhq_production is the default database.
If you already know about the records which are corrupted then you can skip this step. Otherwise write a small script which selects all the records from each table, select query will fail if the table has corrupted data.
Note that we already knew that the corrupted records belong tomerge_request_diffs
table by looking at the backup command error message.
For example if the corrupted record is in the
merge_request_diffs
table, then we can write a small script to identify the row which is corrupted:
The above script basically select all the rows one by one, but will fail to select the row which is corrupted and will print the row index. Assume that selection failed at index 2045, this tells us that the record at index2045 is corrupted.
Repeat this process for all the tables if you don’t know exactly which table has corrupted records.
Database backup without corrupted record(s)
We were fine with losing these corrupted records, so the next step would be to take a backup without those records.
Select the record, and get it’s row / tuple ID.
Note that if all the columns were selected the query was failing, probably because the corrupted data is in one of the columns.
base_commit_sha – 8250ee35a6a8d0cd60b5056c7ac94736a048e7a5
head_commit_sha – b3f08056e93f281c2905f6185395787ae7ffaada
merge_request_id – 1091
id – 1933
Create a copy of
merge_request_diffs
table without the corrupted record i.e. skip id 1933.
This will create a copy of merge_request_diffs table without the corrupted record.
The primary key and the foreign key of merge_request_diffs are not copied tomerge_request_diffs_copy
.
Take the backup of the entire database without table merge_request_diffs. Backups will fail if there are corrupted records.
The above command will skip merge_request_diffs table while taking backup.
Import database on virtual machine
We will setup the backup we took from the previous step on the Vagrant machine.
Create a test database
Create a new database to fix the issues on our vagrant machine.
GitLab owner
Create a PostgreSQL role as gitlab to import the gitlab database.
Since the ownership of all the tables is given to gitlab role in the backup which was taken earlier.
Install pg_trgm extension
GitLab uses
pg_trgm
few extensions which are not installed by default.
Run the following command to install pg_trgm extension:
Import data to gitlab_test
Use psql command to import from the dump.
Note that git_lab_db_dump is the name of the file which contains the database backup from the live instance.
Fix database relationship and constraints
Fixes to database should happen on the vagrant machine.
Get constraints of corrupted table
Go back to the GitLab server and get the table schema of the corrupted table.
Use the below command to get the table schema:
Sample schema is shown below:
Make sure that merge_request_diffs_copy table has the same constraints as the originalmerge_request_diffs table.
Use the above queries to add indexes and foreign keys.
Be very careful with the queries, especially with SEQUENCES,
START WITH 11421 keyword determines what should be the auto increment primary key value for the next record which will be inserted.
Rename table name
Rename the table name from merge_request_diffs_copy to
merge_request_diffs
.
So now, we have the database backup without any corrupted records.
Testing
Install the same version of GitLab on vagrant machine and use gitlab_test database.
Trigger the backup script and confirm that the backups are now being taken properly.
Export database from virtual environment
Take the backup from your vagrant instance
Import database on live instance
Drop the database from live server
Create gitlabhq_production database again on live server
Install pg_trgm
extension
GitLab uses pg_trgm extension which will not be installed by default with PostgreSQL.
Import the data to gitlabhq_production
Use psql command to import all the records from the dump into gitlabhq_production database.
Upgrade GitLab
You can find document here on how to update GitLab.
Note that the upgrade process will automatically run the migrations over the database unless you decide to do it manually.
Finally run tests
After a successful upgrade of GitLab server we should run few tests on it to verify that everything works fine.
Verify that the server is up and running.
Confirm if the database integration is done properly.
Test and verify if the database backup script is able to take backups properly.
References
References were taken from the following blogs, forum posts and Wiki’s.
- https://www.stellarinfo.com/support/kb/index.php/article/common-sql-database-corruption-errors-causes-solutions
- https://dba.stackexchange.com/questions/127846/dumping-database-with-compressed-data-is-corrupt-errors
- http://dbmsmusings.blogspot.in/
- https://www.freelancinggig.com/blog/2017/07/22/mariadb-vs-mysql-vs-postgresql-depth-comparison/
- https://support.office.com/en-us/article/Compact-and-repair-a-database-6ee60f16-aed0-40ac-bf22-85fa9f4005b2