
Backup Database to AWS S3


Intro

Introducing a Python notebook designed to simplify running database backups on a schedule.

What you will learn in this notebook:

  1. How to back up a database to AWS S3 [SQL]

What you get out of using this notebook:

  1. It shows how to use the SingleStore secrets feature to supply the configuration parameters that control the backup process (either full or incremental) in a scheduled environment.
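For orientation, this is the general shape of the BACKUP statements the notebook builds (a minimal sketch based on the code below; the database name, bucket path, and credential values are placeholders):

BACKUP DATABASE my_db WITH INIT
    TO S3 "my-bucket/backups/my_db/2024-01-01/120000/"
    CONFIG '{"region":"us-east-1"}'
    CREDENTIALS '{"aws_access_key_id":"...","aws_secret_access_key":"..."}';

-- A subsequent incremental backup reuses the same path
BACKUP DATABASE my_db WITH DIFFERENTIAL
    TO S3 "my-bucket/backups/my_db/2024-01-01/120000/"
    CONFIG '{"region":"us-east-1"}'
    CREDENTIALS '{"aws_access_key_id":"...","aws_secret_access_key":"..."}';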

Questions?

Reach out to us through our forum.

Prerequisites

You will need the following to proceed:

  1. To access AWS S3: an AWS Access Key ID, an AWS Secret Access Key, and an AWS Session Token (if your credentials require one)

  2. A database user with the 'BACKUP', 'OUTBOUND', and 'PROCESS' grants

  3. An S3 path whose bucket already exists; the rest of the path should not exist yet, and will be created for the initial backup

Note:

  1. Check the user's grants by running 'SHOW GRANTS' (see the example after this list).

  2. If the S3 path does not exist, it will be created by SingleStore.

  3. The general backup file name format is 'database_name.backup'.

  4. The AWS IAM user should have S3 read and write access.
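To verify the grants, something like the following should work (the user name and host are placeholders, and the GRANT line is a hypothetical illustration of assigning the privileges named above):

SHOW GRANTS;
-- or, for a specific user:
SHOW GRANTS FOR 'backup_user'@'%';
-- Hypothetical example of assigning the required privileges:
-- GRANT BACKUP, OUTBOUND, PROCESS ON *.* TO 'backup_user'@'%';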

Imports

In [1]:

import io
import logging
import time
import getpass

import singlestoredb as s2
from IPython.display import display, HTML

Variables

In [2]:

aws_key_id = None
aws_secret_key = None
aws_region = 'us-east-1'
s3_target_path = None
aws_session_token = None
is_incremental_backup = 'N'

Functions to display various alerts

In [3]:

def show_warn(warn_msg):
    """
    Display a warning message in a formatted HTML alert box.

    Parameters
    ----------
    warn_msg : str
        The warning message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-warning">
    <b class="fa fa-solid fa-exclamation-circle"></b>
    <div>
        <p><b>Action Required</b></p>
        <p>{warn_msg}</p>
    </div>
</div>'''))


def show_error(error_msg):
    """
    Display an error message in a formatted HTML alert box.

    Parameters
    ----------
    error_msg : str
        The error message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-danger">
    <b class="fa fa-solid fa-exclamation-triangle"></b>
    <div>
        <p><b>Error</b></p>
        <p>{error_msg}</p>
    </div>
</div>'''))


def show_success(success_msg):
    """
    Display a success message in a formatted HTML alert box.

    Parameters
    ----------
    success_msg : str
        The success message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-success">
    <b class="fa fa-solid fa-check-circle"></b>
    <div>
        <p><b>Success</b></p>
        <p>{success_msg}</p>
    </div>
</div>'''))

Log Control

Note

To enable debug logs:

  • Call 'enable_debug_log(True)' instead of 'enable_debug_log(False)' in the execution cell further below

In [4]:

def enable_debug_log(enabled):
    if enabled:
        logging.getLogger().setLevel(logging.DEBUG)
    else:
        logging.getLogger().setLevel(logging.ERROR)

Utility functions for handling S3 paths, the SQL statement, and the backup

In [5]:

def get_bkp_path(s3_path, db_name, do_increment):
    """
    Get the backup path based on the type of backup (incremental or initial).

    Parameters
    ----------
    s3_path : str
        The base S3 path for backups.
    db_name : str
        The name of the database.
    do_increment : str
        'Y' for an incremental backup, anything else for a full backup.

    Returns
    -------
    str
        The final backup path.
    """
    if do_increment == 'Y':
        logging.info('Is an incremental backup, will use exact path')
        return s3_path
    else:
        logging.info('Is an initial backup, will use time-appended path')
        t = time.localtime(time.time())
        my_path = f'{s3_path}/{db_name}/{t.tm_year}-{t.tm_mon:02d}-{t.tm_mday:02d}/{t.tm_hour:02d}{t.tm_min:02d}{t.tm_sec:02d}/'
        logging.info(f'Backup Path : {my_path}')
        print(f'Backup Path : {my_path}')
        return my_path


def get_sql_statement(db_name_to_bkp, is_incremental_backup):
    """
    Build the SQL statement for backing up a database.

    Parameters
    ----------
    db_name_to_bkp : str
        The name of the database to back up.
    is_incremental_backup : str
        'Y' for an incremental backup, anything else for a full backup.

    Returns
    -------
    str
        The SQL statement for the backup.
    """
    global aws_key_id, aws_secret_key, aws_region, s3_target_path, aws_session_token
    aws_key_id = (input('Enter AWS_API_KEY_ID:') if aws_key_id is None else aws_key_id)
    aws_secret_key = (getpass.getpass('Enter AWS_API_SECRET:') if aws_secret_key is None else aws_secret_key)
    aws_region = (input('Enter AWS_REGION:') if aws_region is None else aws_region)
    s3_target_path = (input('Enter AWS S3 Path:') if s3_target_path is None else s3_target_path)
    aws_session_token = (input('Enter AWS_SESSION_TOKEN:') if aws_session_token is None else aws_session_token)

    data = io.StringIO()
    data.write('BACKUP DATABASE ' + db_name_to_bkp + ' ')
    if is_incremental_backup == 'Y':
        data.write(' WITH DIFFERENTIAL ')
    else:
        data.write(' WITH INIT ')
    data.write(' TO S3 "' + get_bkp_path(s3_target_path, db_name_to_bkp, is_incremental_backup) + '" ')
    data.write(' CONFIG \'{"region":"' + aws_region + '"}\'')
    data.write(' CREDENTIALS \'{"aws_access_key_id":"' + aws_key_id
               + '","aws_secret_access_key":"' + aws_secret_key + '"')
    if aws_session_token:  # skip the token when it is empty or None
        data.write(', "aws_session_token":"' + aws_session_token + '" ')

    data.write('}\' ')
    logging.debug(f'statement: {data.getvalue()}')
    return data.getvalue()


def perform_backup(my_cursor, curr_db_name, do_incremental):
    """
    Perform a database backup.

    Parameters
    ----------
    my_cursor : cursor
        The database cursor.
    curr_db_name : str
        The name of the database to back up.
    do_incremental : str
        'Y' to perform an incremental backup.
    """
    logging.debug(f'backing up db {curr_db_name}')
    my_cursor.execute(get_sql_statement(curr_db_name, do_incremental))
    results = my_cursor.fetchall()  # was `cursor.fetchall()`, which referenced an undefined name
    if results is None:
        logging.error('Backup execution failed')
    else:
        logging.info('Backup completed')

In [6]:

enable_debug_log(True)
print(connection_url)
try:
    if connection_url.endswith('/') or connection_url.endswith('information_schema'):
        # Handle the case where no database is selected or information_schema is selected:
        # connect to information_schema and back up all databases
        logging.debug('No database selected, will use information_schema and back up all databases')
        my_db_url = connection_url
        if connection_url.endswith('/'):
            my_db_url = my_db_url + 'information_schema'
            logging.debug(f'connection url updated {my_db_url}')

        conn = s2.connect(my_db_url, results_type='dict')
        with conn.cursor() as cursor:
            # Get a list of databases to back up
            cursor.execute(
                "SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT IN ( 'cluster', 'memsql', 'information_schema' );")
            for row in cursor.fetchall():
                logging.debug(f"processing db {row['schema_name']}")
                # Back up each database
                perform_backup(my_cursor=cursor, curr_db_name=row['schema_name'], do_incremental='N')
                logging.debug(f"processing db {row['schema_name']} complete")

    else:
        # Connect to the selected database and take its backup
        database_to_bkp = connection_url.split('/')[-1]

        # Establish a connection to the database
        conn = s2.connect(results_type='dict')
        with conn.cursor() as cursor:
            perform_backup(my_cursor=cursor, curr_db_name=database_to_bkp, do_incremental=is_incremental_backup)

    show_success('Backup Process Completed')

except s2.exceptions.OperationalError as ope:
    # Handle specific operational errors
    if 'NoSuchBucket' in ope.errmsg:
        logging.error('Provided S3 Bucket does not exist. Please check.')
        show_error('Provided S3 Bucket does not exist. Please check.')
    elif 'Access denied' in ope.errmsg:
        logging.error('Failed to back up due to missing grants or firewall settings. Please check.')
        show_error('Failed to back up due to missing grants or firewall settings. Please check.')
    else:
        logging.error(f'Failed. Error message: {ope.errmsg}')
        show_error(f'Failed to back up. {ope.errmsg}')
except s2.Error as e:
    # Handle any other errors
    print(f'Encountered exception {e}')
    logging.exception(e)
    show_error(f'Failed to back up. {str(e)}')

print('\n\nScript execution completed')

Verify Result

If the script executed without errors, check the S3 bucket for the uploaded files (the backup path is printed to the console).

The general file name format is 'database_name.backup' or 'database_name.incr_backup'.

You can use the query below to check the backups created (add filters to limit the data as needed; a filtered sketch follows):

SELECT * FROM information_schema.MV_BACKUP_HISTORY;
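For example, a sketch that narrows the output to one database (the database name is a placeholder, and the DATABASE_NAME and BACKUP_ID column names are an assumption to verify against your version's information_schema documentation):

SELECT *
FROM information_schema.MV_BACKUP_HISTORY
WHERE DATABASE_NAME = 'my_db'   -- placeholder database name
ORDER BY BACKUP_ID DESC         -- assumed column; most recent first
LIMIT 10;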

Important Note

  • To use this as a scheduled notebook, modify it to read the configuration data from a table instead of user input (a sketch follows below)
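One possible shape for that configuration table (a minimal sketch; the table and column names are hypothetical, and secret values such as AWS keys should still come from the SingleStore secrets feature rather than plain-text columns):

-- Hypothetical configuration table for scheduled runs
CREATE TABLE IF NOT EXISTS backup_config (
    db_name VARCHAR(64) NOT NULL,
    s3_target_path VARCHAR(512) NOT NULL,
    aws_region VARCHAR(32) NOT NULL DEFAULT 'us-east-1',
    is_incremental_backup CHAR(1) NOT NULL DEFAULT 'N'
);

-- The notebook would then read its parameters with a query such as:
SELECT db_name, s3_target_path, aws_region, is_incremental_backup
FROM backup_config
WHERE db_name = 'my_db';  -- placeholder database name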

Details


About this Template

Backs up a single database or multiple databases to AWS S3.


This Notebook can be run in Standard and Enterprise deployments.

Tags

starter, backup

License

This Notebook has been released under the Apache 2.0 open source license.
