Python Script Tutorial: Download ABI Level 2 Data Files from Amazon Web Services (AWS)

ABI Level 2 data files are freely available from NOAA's GOES archive on AWS.

Using Jupyter Notebook format, this tutorial demonstrates how to:

  1. Enter parameters to search for any ABI Level 2 data files on AWS for an observation day and time period of interest, including:
    • Satellite (GOES-16 or GOES-17)
    • ABI Level 2 product
    • ABI scan sector (Full Disk, CONUS, Meso 1, Meso 2)
    • Observation Year
    • Observation Month
    • Observation Day
    • Observation Start Time (UTC)
    • Observation End Time (UTC)
  2. Query AWS for available files matching the user-entered search parameters
  3. Download the available data files to the user's local computer/server

Please acknowledge the NOAA/NESDIS/STAR Aerosols and Atmospheric Composition Science Team if using any of this code in your work/research!


Import Python packages

The first step is to import all of the Python packages we need for the entire script. We will use the S3Fs library to access the AWS Simple Storage Service (S3) using anonymous credentials.

# Import Python packages

# Library to perform array operations
import numpy as np

# Module to interface with Amazon Simple Storage Service (S3)
import s3fs

# Module for manipulating dates and times
import datetime

# Library to create progress bars for loops/functions
from tqdm import tqdm

# Module for accessing system-specific parameters and functions
import sys

# Library to access core utilities for Python packages
from packaging.version import parse

# Module to set filesystem paths appropriate for user's operating system
from pathlib import Path

# Modules to create interactive menus in Jupyter Notebook
from IPython.display import display
import ipywidgets as widgets

Enter search parameters using Jupyter Widgets menus

We need a way for the user to enter the parameters for their AWS search. In Jupyter Notebook, we can use Jupyter widgets to make user-friendly GUI pull-down menus for entering search variables.

First, run this code block to generate the interactive menus. Then, use the menus to select the satellite, Level 2 product, ABI scan sector, and observation year/month/day and start/end time for the AWS search. The search parameters are input to the rest of the script from the menus via the main function, by reading the ".value" of each menu variable.

It is only necessary to run the widgets code block once to generate the menus. After that, selections in the menus can be changed, and multiple consecutive AWS searches can be run via the main function, without re-running the widgets code. Re-running this code block resets all the menus to their default values, so the user will need to re-select the search parameters of interest before running the main function.

# Enter satellite, ABI L2 product, view sector, observation date & start/end times for AWS search
# Selections are made using interactive Jupyter Notebook widgets
# Run this block *once* to generate menus
# When main function is run, it reads ".value" of each menu selection
# Do NOT re-run block if you change menu selections (re-running block resets menus to defaults)!

# Formatting settings for drop-down menus
style = {'description_width':'120px'}
layout = widgets.Layout(width='375px')

# Create drop-down menus using widgets
satellite = widgets.Dropdown(options=[('GOES-16', 16), ('GOES-17', 17)], description='Satellite:', style=style, layout=layout)
product = widgets.Dropdown(options=[('Aerosol Detection'), ('Aerosol Optical Depth'), ('Clear Sky Mask'), ('Cloud & Moisture Imagery'), ('Cloud & Moisture Imagery Multiband'), ('Cloud Optical Depth'), ('Cloud Particle Size'), ('Cloud Top Height'), ('Cloud Top Phase'), ('Cloud Top Pressure'), ('Cloud Top Temperature'), ('Derived Motion Winds'), ('Derived Stability Indices'), ('Downward Shortwave Radiation'), ('Fire Hotspot Characterization'), ('Land Surface Temperature'), ('Legacy Vertical Moisture Profile'), ('Legacy Vertical Temperature Profile'), ('Rainfall Rate/QPE'), ('Reflected Shortwave Radiation'), ('Sea Surface Temperature'), ('Total Precipitable Water'), ('Volcanic Ash')], description='Product:', style=style, layout=layout)
sector = widgets.Dropdown(options=[('Full Disk'), ('CONUS'), ('Meso 1'), ('Meso 2')], description='Scan Sector:', style=style, layout=layout)
year = widgets.Dropdown(options=[('2019', 2019), ('2020', 2020), ('2021', 2021), ('2022', 2022), ('2023', 2023), ('2024', 2024), ('2025', 2025)], description='Year:', style=style, layout=layout)
month = widgets.Dropdown(options=[('Jan', 1), ('Feb', 2), ('Mar', 3), ('Apr', 4), ('May', 5), ('Jun', 6), ('Jul', 7), ('Aug', 8), ('Sep', 9), ('Oct', 10), ('Nov', 11), ('Dec', 12)], description='Month:', style=style, layout=layout)
day = widgets.Dropdown(options=[(str(d), d) for d in range(1, 32)], description='Day:', style=style, layout=layout)
# Hour and minute options are zero-padded strings ('00', '01', ...)
hours = [f'{h:02d}' for h in range(24)]
minutes = [f'{m:02d}' for m in range(60)]
shour = widgets.Dropdown(options=hours, description='Start Hour (UTC):', style=style, layout=layout)
smin = widgets.Dropdown(options=minutes, description='Start Minutes (UTC):', style=style, layout=layout)
ehour = widgets.Dropdown(options=hours, description='End Hour (UTC):', style=style, layout=layout)
emin = widgets.Dropdown(options=minutes, description='End Minutes (UTC):', style=style, layout=layout)

# Format observation start/end time hour and minutes menus to display side-by-side
start_time = widgets.HBox([shour, smin])
end_time = widgets.HBox([ehour, emin])

# Display drop-down menus
print('If you change menu selections (e.g., to run another search), do NOT re-run this block!\nRe-running will re-set all menus to their defaults!')
display(satellite, product, sector, year, month, day)
display(start_time, end_time)

The image below shows a screenshot of the output GUI, with the pull-down menus set to search for GOES-16 ABI CONUS sector aerosol optical depth (AOD) data files on April 4, 2022 at 18:00-18:15 UTC.

Example of Jupyter widgets pull-down menus

Function to find Julian day

ABI Level 2 file names use the Julian day (day of the year) instead of the Gregorian month-day (day of the month). We create a function, called "find_julian( )", that uses the datetime module to find the Julian day corresponding to the user-entered observation year, month, and day. We will use the Julian day in our query to AWS.

# Find Julian day from user-specified observation year/month/day
# ABI data files are classified by Julian day; needed for AWS search
# "year", "month", "day": parameter variables from widget menus, set in main function

def find_julian(year, month, day):
    calendar = datetime.datetime(year, month, day)
    julian_day = calendar.strftime('%j')
    
    return julian_day
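
As a quick check of the conversion, here is a minimal sketch using only the datetime module. The helper name "day_of_year" is hypothetical (it is not part of the tutorial script) and simply mirrors "find_julian( )". It also shows that datetime rejects impossible dates, such as February 31, which the day and month menus allow the user to select.

```python
import datetime


def day_of_year(year, month, day):
    """Return the zero-padded day of the year (e.g., '094') for a calendar date."""
    return datetime.datetime(year, month, day).strftime('%j')


# April 4, 2022 is the 94th day of the year
print(day_of_year(2022, 4, 4))  # 094

# Impossible dates raise ValueError instead of returning a day of the year
try:
    day_of_year(2022, 2, 31)
except ValueError as error:
    print('Invalid date:', error)
```

The value "094" matches the day of the year embedded in the April 4, 2022 file names shown in the example output at the end of this tutorial.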

Function to get ABI Level 2 product/scan sector abbreviation

ABI Level 2 file names include an abbreviation consisting of 3-4 letters for the Level 2 product followed by a 1-letter abbreviation for the ABI scan sector. We create a function, called "get_product_abbreviation( )", that returns the abbreviation corresponding to the user-entered product and scan sector. We will use this abbreviation in our query to AWS.

Note that not all products are generated for all scan sectors. For example, aerosol optical depth (AOD) is not generated for the Meso 1 or Meso 2 sectors. If the user enters a scan sector for which a product is not generated, the product abbreviation is returned as the string "None"; in the main function, we use this to print a notification message for the user.

# Find ABI L2 product abbreviation from user-specified product/scan sector
# Abbreviation is part of ABI file name; needed for AWS search
# "sector", "product": parameter variables from widget menus, set in main function

def get_product_abbreviation(sector, product):
    
    # Define dictionary keys
    keys = ['Full Disk', 'CONUS', 'Meso 1', 'Meso 2']
    
    # Define dictionary values for each ABI L2 product 
    if product == 'Aerosol Detection':
        values = ['ABI-L2-ADPF', 'ABI-L2-ADPC', 'ABI-L2-ADPM', 'ABI-L2-ADPM']
    elif product == 'Aerosol Optical Depth':
        values = ['ABI-L2-AODF', 'ABI-L2-AODC', 'None', 'None']
    elif product == 'Clear Sky Mask':
        values = ['ABI-L2-ACMF', 'ABI-L2-ACMC', 'ABI-L2-ACMM', 'ABI-L2-ACMM']
    elif product == 'Cloud & Moisture Imagery':
        values = ['ABI-L2-CMIPF', 'ABI-L2-CMIPC', 'ABI-L2-CMIPM', 'ABI-L2-CMIPM']
    elif product == 'Cloud & Moisture Imagery Multiband':
        values = ['ABI-L2-MCMIPF', 'ABI-L2-MCMIPC', 'ABI-L2-MCMIPM', 'ABI-L2-MCMIPM']
    elif product == 'Cloud Optical Depth':
        values = ['ABI-L2-CODF', 'ABI-L2-CODC', 'None', 'None']
    elif product == 'Cloud Particle Size':
        values = ['ABI-L2-CPSF', 'ABI-L2-CPSC', 'ABI-L2-CPSM', 'ABI-L2-CPSM']
    elif product == 'Cloud Top Height':
        values = ['ABI-L2-ACHAF', 'ABI-L2-ACHAC', 'ABI-L2-ACHAM', 'ABI-L2-ACHAM']
    elif product == 'Cloud Top Phase':
        values = ['ABI-L2-ACTPF', 'ABI-L2-ACTPC', 'ABI-L2-ACTPM', 'ABI-L2-ACTPM']
    elif product == 'Cloud Top Pressure':
        values = ['ABI-L2-CTPF', 'ABI-L2-CTPC', 'None', 'None']
    elif product == 'Cloud Top Temperature':
        values = ['ABI-L2-ACHTF', 'None', 'ABI-L2-ACHTM', 'ABI-L2-ACHTM']
    elif product == 'Derived Motion Winds':
        values = ['ABI-L2-DMWF', 'ABI-L2-DMWC', 'ABI-L2-DMWM', 'ABI-L2-DMWM']
    elif product == 'Derived Stability Indices':
        values = ['ABI-L2-DSIF', 'ABI-L2-DSIC', 'ABI-L2-DSIM', 'ABI-L2-DSIM']
    elif product == 'Downward Shortwave Radiation':
        values = ['ABI-L2-DSRF', 'ABI-L2-DSRC', 'ABI-L2-DSRM', 'ABI-L2-DSRM']
    elif product == 'Fire Hotspot Characterization':
        values = ['ABI-L2-FDCF', 'ABI-L2-FDCC', 'ABI-L2-FDCM', 'ABI-L2-FDCM']
    elif product == 'Land Surface Temperature':
        values = ['ABI-L2-LSTF', 'ABI-L2-LSTC', 'ABI-L2-LSTM', 'ABI-L2-LSTM']
    elif product == 'Legacy Vertical Moisture Profile':
        values = ['ABI-L2-LVMPF', 'ABI-L2-LVMPC', 'ABI-L2-LVMPM', 'ABI-L2-LVMPM']
    elif product == 'Legacy Vertical Temperature Profile':
        values = ['ABI-L2-LVTPF', 'ABI-L2-LVTPC', 'ABI-L2-LVTPM', 'ABI-L2-LVTPM']
    elif product == 'Rainfall Rate/QPE':
        values = ['ABI-L2-RRQPEF', 'None', 'None', 'None']
    elif product == 'Reflected Shortwave Radiation':
        values = ['ABI-L2-RSRF', 'ABI-L2-RSRC', 'None', 'None']
    elif product == 'Sea Surface Temperature':
        values = ['ABI-L2-SSTF', 'None', 'None', 'None']
    elif product == 'Total Precipitable Water':
        values = ['ABI-L2-TPWF', 'ABI-L2-TPWC', 'ABI-L2-TPWM', 'ABI-L2-TPWM']
    elif product == 'Volcanic Ash':
        values = ['ABI-L2-VAAF', 'None', 'None', 'None']

    # Use dictionary comprehension to combine "keys" and "values" lists
    abbreviation_dictionary = {keys[i]: values[i] for i in range(len(keys))}
    
    # Get product abbreviation for specified product and scan sector
    product_abbreviation = abbreviation_dictionary.get(sector)
    
    return product_abbreviation
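
One possible alternative to the chain of "if/elif" statements is a nested dictionary lookup. The sketch below is only an illustration, not the tutorial's method: the names "ABBREVIATIONS" and "lookup_abbreviation" are hypothetical, and the table covers just three of the products. It returns the string "None" for unavailable combinations, following the convention used above.

```python
# Hypothetical lookup table covering only three products, for illustration
ABBREVIATIONS = {
    'Aerosol Optical Depth': {'Full Disk': 'ABI-L2-AODF', 'CONUS': 'ABI-L2-AODC'},
    'Clear Sky Mask': {'Full Disk': 'ABI-L2-ACMF', 'CONUS': 'ABI-L2-ACMC',
                       'Meso 1': 'ABI-L2-ACMM', 'Meso 2': 'ABI-L2-ACMM'},
    'Sea Surface Temperature': {'Full Disk': 'ABI-L2-SSTF'},
}


def lookup_abbreviation(sector, product):
    """Return the product abbreviation, or the string 'None' if unavailable."""
    return ABBREVIATIONS.get(product, {}).get(sector, 'None')


print(lookup_abbreviation('CONUS', 'Aerosol Optical Depth'))   # ABI-L2-AODC
print(lookup_abbreviation('Meso 1', 'Aerosol Optical Depth'))  # None
```

With this layout, a product name that is missing from the table also yields "None" rather than raising an error.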

Function to create list of available ABI Level 2 data file names

We create a function, called "aws_abi_list( )", that returns a list of the available ABI data file names matching the user-entered satellite/product and date/time period. There are multiple steps in this function.

First, we access AWS anonymously using the S3Fs library. No login ID or password is required!

Next, we make a list called "all_hours_list" that contains all the available ABI data file names for the user-entered satellite/product/scan sector and observation date, encompassing the full start and end hours of the observation time period. For example, if the observation start time is 16:15 and the end time is 20:30, "all_hours_list" will contain all the files from 16:00 to 20:59.

To populate "all_hours_list", we find the Julian day and product abbreviation using the "find_julian( )" and "get_product_abbreviation( )" functions we created. Then we use Python's "range( )" constructor to calculate the sequence of integers for the range of hours encompassing the user-entered observation time period ("hour_range"). In our example observation period of 16:15-20:30, the "hour_range" will be [16, 17, 18, 19, 20]. We loop through the "hour_range", using the S3Fs "ls( )" command to "list" the file names for each full hour in the observation time period, and then add the names to "all_hours_list" using the "extend( )" command.

The last step is to loop through the file names in "all_hours_list" and extract the file names that correspond to the exact time period entered by the user (in our example, 16:15-20:30) and put them in a new list called "data". We do this by comparing the start time in each ABI file name in "all_hours_list" to the user-entered observation time period. We use reverse indexing, counting from the end of the ABI file names, because the length of the beginning of ABI Level 2 file names varies depending on the "product_abbreviation".
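
Before looking at the full function, the steps above can be tried without connecting to AWS. The following sketch uses example values for the search parameters (GOES-16 CONUS AOD on April 4, 2022 at 18:00-18:15 UTC) and a file name taken from the example output at the end of this tutorial. It builds the S3 prefix for each hour and uses reverse indexing to read the observation start time from the file name.

```python
# Example search parameters (hours/minutes are strings, as in the widget menus)
satellite, product_abbreviation = 16, 'ABI-L2-AODC'
year, julian_day = 2022, '094'
start_hour, start_min, end_hour, end_min = '18', '00', '18', '15'

# One S3 prefix per full hour in the observation period
prefixes = [f'noaa-goes{satellite}/{product_abbreviation}/{year}/{julian_day}/{hour:02d}/'
            for hour in range(int(start_hour), int(end_hour) + 1)]
print(prefixes)  # ['noaa-goes16/ABI-L2-AODC/2022/094/18/']

# Full path of one file, using a file name from the tutorial's example output
file = prefixes[0] + 'OR_ABI-L2-AODC-M6_G16_s20220941801170_e20220941803543_c20220941806510.nc'

# Counting from the end, characters -42 through -39 hold the start time (HHMM)
file_start = file[-42:-38]
print(file_start)  # 1801

# Zero-padded HHMM strings sort in time order, so string comparison works
in_period = (start_hour + start_min) <= file_start <= (end_hour + end_min)
print(in_period)  # True
```

Because the comparison is made on zero-padded strings of equal length, "1801" falls between "1800" and "1815" exactly as the corresponding times do.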

# Create list containing ABI L2 data file names for user-specified satellite/product and date/time period
# "year", "month", "day", "start_hour", "start_min", "end_hour", "end_min", "satellite", "sector", "product": parameter 
# variables from widget menus, set in main function

def aws_abi_list(year, month, day, start_hour, start_min, end_hour, end_min, satellite, sector, product):
    
    # Access AWS S3 using anonymous credentials
    aws = s3fs.S3FileSystem(anon=True)
    
    # Get all ABI L2 data file names encompassing user-specified satellite/product, date, and start/end hours
    julian_day = find_julian(year, month, day)
    product_abbreviation = get_product_abbreviation(sector, product)
    hour_range = range(int(start_hour), int(end_hour) + 1)
    all_hours_list = []
    for hour in hour_range:
        # Query AWS ABI archive for ABI L2 file names
        # "'{number:02d}'.format(number=hour)" adds leading zero to hours < 10 in hour_range array
        # "refresh=True" argument clears cache so NRT files on AWS ABI archive are retrievable
        hour_files = aws.ls('noaa-goes' + str(satellite) + '/' + product_abbreviation + '/' + str(year) + '/' + julian_day + '/' + '{number:02d}'.format(number=hour) + '/', refresh=True)
        all_hours_list.extend(hour_files)
    
    # Extract ABI L2 data file names for exact period set by user-specified observation start/end times
    # Use reverse indexing to count from end of ABI file names
    data = []
    for file in all_hours_list:
        # For Meso products, extract only file names for user-specified view sector (e.g., "Meso 1" or "Meso 2")
        if sector == 'Meso 1' or sector == 'Meso 2':
            # Extract file names for L2 products that have files for individual ABI bands
            if product == 'Cloud & Moisture Imagery' or product == 'Derived Motion Winds':
                if file[-42:-38] >= (start_hour + start_min) and file[-42:-38] <= (end_hour + end_min) and file[-62] == sector[-1]:
                    data.append(file)
                else:
                    continue
            else:
                # Extract file names for remaining L2 products
                if file[-42:-38] >= (start_hour + start_min) and file[-42:-38] <= (end_hour + end_min) and file[-59] == sector[-1]:
                    data.append(file)
                else:
                    continue
        else:
            # Extract file names for Full Disk and CONUS products
            if file[-42:-38] >= (start_hour + start_min) and file[-42:-38] <= (end_hour + end_min):
                data.append(file)
            else:
                continue

    return data
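
Reverse indexing depends on the fixed length of the end of the file name. As a possible alternative, the sketch below parses the same fields with a regular expression. This is an assumption-laden illustration, not part of the tutorial script: the pattern and the name "parse_abi_name" are hypothetical, the pattern is based only on the file name layout seen in this tutorial, and the Meso file name in the example is made up for demonstration.

```python
import re

# Hypothetical pattern: product abbreviation, optional Meso sector number,
# scan mode, optional band (e.g., C01), satellite, and start time (HHMM)
PATTERN = re.compile(r'OR_(ABI-L2-[A-Z]+?)([12])?-M\d(?:C\d{2})?_G\d{2}_s\d{7}(\d{4})\d{3}_e')


def parse_abi_name(file):
    """Return (abbreviation, meso_number, start_hhmm), or None if no match."""
    match = PATTERN.search(file)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


# File name from the tutorial's example output (CONUS sector, no Meso number)
print(parse_abi_name('OR_ABI-L2-AODC-M6_G16_s20220941801170_e20220941803543_c20220941806510.nc'))
# ('ABI-L2-AODC', None, '1801')

# Made-up Meso 1 file name, for illustration only
print(parse_abi_name('OR_ABI-L2-ACMM1-M6_G16_s20220941801250_e20220941801307_c20220941802100.nc'))
# ('ABI-L2-ACMM', '1', '1801')
```

A pattern like this does not need separate index offsets for products with per-band files, but it should be checked against real file names for each product before use.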

Function to list and download available ABI Level 2 data files

We create a function, called "get_abi_files( )", that first prints the AWS search results and the name of the directory where the files will be saved. We also print the sizes of the available files, because some ABI files can be large (10-50 MB). Then we ask the user if they want to download the files ("yes/no"). This allows the user to review the results of the search, as well as the destination directory, before initiating the download. If there are any problems with the search results, for example if the wrong satellite was selected by mistake, the user can answer "no" to terminate the script, and then adjust the search parameters in the GUI pull-down menus and re-run the script via the main function.

If the user enters "yes" to download the files, we access AWS anonymously again using the S3Fs library, and then we loop through the file names in the "data" list and use the S3Fs "get( )" command to copy (download) the corresponding files to the designated directory on the user's local computer/server.

We use the tqdm library to display a progress bar for the file download. It shows the percent complete of the total download, the number of files downloaded and the total number of files, and the total time elapsed/remaining in the download. Prior to the download loop, we flush the buffer for users running Python v3.8 or earlier, in order to avoid a glitch in the "tqdm" library.

# Print available ABI L2 data files that match user specifications, with option to download files
# "save_path": parameter variable assigned in main function

def get_abi_files(year, month, day, start_hour, start_min, end_hour, end_min, satellite, sector, product, save_path):

    # Query AWS ABI archive and print names/sizes of available L2 files
    data = aws_abi_list(year, month, day, start_hour, start_min, end_hour, end_min, satellite, sector, product)
    
    if len(data) > 0:
        # Access AWS using anonymous credentials
        aws = s3fs.S3FileSystem(anon=True)
        
        # Print list of available data files
        print('Available data files (approximate file size):')
        for file in data:
            file_size = aws.size(file)
            # sep='' removes extra spaces b/w print elements
            print(file.split('/')[-1], ' (', np.format_float_positional(np.float16(file_size/1.0E6), unique=False, precision=1), ' MB)', sep='')
        
        # Print directory where files will be saved
        print('\nData files will be saved to: ' + str(save_path))
        
        # Ask user if they want to download the available data files
        # If yes, download files to specified directory
        download_question = 'Would you like to download the ' + str(len(data)) + ' files?\nType "yes" or "no" and hit "Enter"\n'
        download_files = input(download_question)
        if download_files in ['yes', 'YES', 'Yes', 'y', 'Y']:
            
            # Display progress bar using tqdm library
            # Flush buffer if Python version < v3.9 to avoid glitch in tqdm library
            if parse(sys.version.split(' ')[0]) < parse('3.9'):
                sys.stdout.flush()
            for name in tqdm(data, unit='files', bar_format="{desc}Downloading:{percentage:3.0f}%|{bar}|{n_fmt}/{total_fmt} [{elapsed}<{remaining}]"):
                # Set save_path + file_name as pathlib.Path object and convert to string (for AWS)
                full_path = str(save_path / name.split('/')[-1])
                # Download file from AWS archive
                aws.get(name, full_path)
            print('\nDownload complete!')
        else:
            print('Files are not being downloaded.')
    else:
        print('No files retrieved. Check settings and try again.')
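
When the same search is run more than once, it may be useful to skip files that were already downloaded. The sketch below shows one way to do this with pathlib; it is an optional extension, not part of the tutorial script. The function name "files_not_yet_downloaded" is hypothetical, and the file names in the example are placeholders.

```python
from pathlib import Path
import tempfile


def files_not_yet_downloaded(names, save_path):
    """Return the S3 file names that have no local copy in save_path."""
    return [name for name in names
            if not (save_path / name.split('/')[-1]).exists()]


# Demonstrate with a temporary directory and placeholder file names
with tempfile.TemporaryDirectory() as folder:
    save_path = Path(folder)
    names = ['bucket/product/a.nc', 'bucket/product/b.nc']
    (save_path / 'a.nc').touch()  # pretend "a.nc" was downloaded earlier
    print(files_not_yet_downloaded(names, save_path))  # ['bucket/product/b.nc']
```

The filtered list could then be passed to the download loop in place of the full "data" list. Note that this check looks only at file names, so a partially downloaded file would also be skipped.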

Main function: execute script

The main function executes the script by calling the "get_abi_files( )" function we created. Prior to that, we enter the directory where downloaded files will be saved ("save_path"). For simplicity, we set the directory as the current working directory, but this could be replaced by a user-entered directory path; we recommend using the pathlib module to set filesystem paths.

The parameter variables for the "get_abi_files( )" function include the AWS search parameters, entered in the widget menus. We obtain these variables by reading ".value" for each of the widget menu variables.

We also include a basic error check: in the event the user entered a scan sector for which the selected ABI Level 2 product is not generated, a notification message is printed and the script terminates.

# Execute search of AWS to find ABI L2 data files, with option to download files
# Get values from widget menus (AWS search parameters) using ".value"

# Main function
if __name__ == "__main__":
    
    # Set directory to save downloaded ABI files (as pathlib.Path object)
    # Use current working directory for simplicity
    save_path = Path.cwd()
 
    # Notify user if selected product is not generated for selected scan sector
    product_abbreviation = get_product_abbreviation(sector.value, product.value)
    if product_abbreviation == 'None':
        print('The selected product is not generated for the selected view sector. Try again.')
    else:
        # List/download available ABI L2 data files
        get_abi_files(year.value, month.value, day.value, shour.value, smin.value, ehour.value, emin.value, satellite.value, 
                  sector.value, product.value, save_path)

Example of output from search for GOES-16 ABI CONUS sector aerosol optical depth (AOD) data files on April 4, 2022 at 18:00-18:15 UTC:

Available data files (approximate file size):
OR_ABI-L2-AODC-M6_G16_s20220941801170_e20220941803543_c20220941806510.nc (6.5 MB)
OR_ABI-L2-AODC-M6_G16_s20220941806170_e20220941808543_c20220941811413.nc (6.5 MB)
OR_ABI-L2-AODC-M6_G16_s20220941811170_e20220941813543_c20220941816380.nc (6.5 MB)

Data files will be saved to: C:\Users\Trainings\Website
Would you like to download the 3 files?
Type "yes" or "no" and hit "Enter"
yes
Downloading:100%|█████████████████████████|3/3 [00:12<00:00]

Download complete!
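
The main function sets "save_path" to the current working directory for simplicity. For a different destination, a minimal sketch using pathlib is shown below. The helper name "make_save_path" and the directory names are hypothetical; the example runs in a temporary directory so that it leaves no files behind.

```python
from pathlib import Path
import tempfile


def make_save_path(*parts):
    """Build a directory path from its parts and create it if needed."""
    save_path = Path(*parts)
    # parents=True creates intermediate directories; exist_ok=True allows re-runs
    save_path.mkdir(parents=True, exist_ok=True)
    return save_path


# Demonstrate inside a temporary directory (directory names are placeholders)
with tempfile.TemporaryDirectory() as base:
    save_path = make_save_path(base, 'abi_data', 'aod')
    print(save_path.is_dir())  # True
```

In the main function, "save_path = Path.cwd()" could be replaced by a call such as "make_save_path(Path.cwd(), 'abi_data')". Building the path from separate parts lets pathlib choose the correct separator for the user's operating system.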