Open the AWS Glue service.
Create a database.
Name the database and click 'Create database'.
A database with the desired name is created.
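The same database can also be created programmatically. A minimal sketch, assuming a hypothetical database name and description; the payload shape matches boto3's `glue.create_database(DatabaseInput=...)`:

```python
import json

def build_database_input(name, description=""):
    """Build the DatabaseInput payload expected by
    boto3's glue_client.create_database(DatabaseInput=...)."""
    payload = {"Name": name}
    if description:
        payload["Description"] = description
    return payload

# "reports_db" and the description are hypothetical placeholders.
params = build_database_input("reports_db", "Tables for parquet data in S3")
print(json.dumps(params, sort_keys=True))

# To create the database for real (requires AWS credentials):
#   import boto3
#   boto3.client("glue").create_database(DatabaseInput=params)
```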
Open the IAM service and select 'Roles'.
Click 'Create role'.
Select the trusted entity type, choose 'Glue' as the use case, and click 'Next'.
Under 'Add permissions', attach the permissions policies the Glue job needs (for example, the AWS managed 'AWSGlueServiceRole' policy), and click 'Next'.
Under 'Name, review, and create', enter a role name, add a description if required, and click 'Create role'.
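Behind the console wizard, choosing 'Glue' as the use case attaches a trust policy that lets the Glue service assume the role. A sketch of that standard trust policy (the role name in the comment is hypothetical):

```python
import json

# Trust policy allowing the AWS Glue service principal to assume the role;
# the console's 'Glue' use case generates the equivalent for you.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))

# With boto3 (requires AWS credentials), the role could be created as:
#   boto3.client("iam").create_role(
#       RoleName="glue-etl-role",  # hypothetical role name
#       AssumeRolePolicyDocument=json.dumps(trust_policy),
#   )
```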
Click 'ETL jobs'.
Create a script by clicking the 'Script editor' tab.
Select 'Spark' from the Engine drop-down.
Enable 'Upload script' and click 'Choose file'.
Locate the script file on your device, select it, and click 'Open'.
View or edit your script code if required.
Add the basic properties under the 'Job details' tab.
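The basic properties on the 'Job details' tab (name, IAM role, Glue version, worker settings) map onto the parameters of boto3's `glue.create_job`. A sketch, assuming hypothetical job name, role ARN, and script path:

```python
def build_job_config(name, role_arn, script_location):
    """Sketch of 'Job details' basic properties as they map onto
    boto3's glue_client.create_job(**config)."""
    return {
        "Name": name,
        "Role": role_arn,  # the IAM role created earlier
        "Command": {
            "Name": "glueetl",               # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",                # pick the version you need
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }

# All three arguments below are hypothetical placeholders.
config = build_job_config(
    "parquet-loader",
    "arn:aws:iam::123456789012:role/glue-etl-role",
    "s3://my-bucket/scripts/job.py",
)
print(config["Command"]["Name"])
```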
Once the properties are added, click 'Create schedule'.
Add the schedule properties and click 'Create schedule'.
Click the 'CRON expressions' link to learn more about the cron syntax used to define a schedule.
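For reference, AWS scheduled cron expressions use six fields, `cron(minutes hours day-of-month month day-of-week year)`, and one of day-of-month / day-of-week must be `?`. A small helper sketch for a daily schedule (the helper itself is hypothetical):

```python
def daily_schedule(hour_utc, minute=0):
    """Build an AWS cron expression that fires once a day at the
    given UTC time. Field order:
    cron(minutes hours day-of-month month day-of-week year)."""
    return f"cron({minute} {hour_utc} * * ? *)"

expr = daily_schedule(12, 30)  # every day at 12:30 UTC
print(expr)
```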
Click the 'Run' tab and then click 'Run job' to run the job manually.
Note:
The database will be populated only if the parquet files are already available in S3; otherwise, there is no need to run the job manually.
The run screen then displays the active status of the job.
The job is successfully scheduled.
The database details, along with the logger number, are displayed.
Note:
These details appear only when the job is run manually after creation; otherwise, they are not available until the first scheduled run.
Click on the logger serial number to get a detailed report.
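The same run status shown on screen can be checked programmatically. A sketch, assuming the hypothetical job name from earlier; the terminal-state set lists the common Glue `JobRunState` values for a finished run:

```python
# Common terminal states for a Glue job run (JobRunState); a run in any
# other state (e.g. STARTING, RUNNING, STOPPING) is still in progress.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}

def is_finished(job_run_state):
    """True once a job run has reached a terminal state."""
    return job_run_state in TERMINAL_STATES

# Polling sketch (requires AWS credentials; job name and run id are
# hypothetical):
#   glue = boto3.client("glue")
#   run = glue.get_job_run(JobName="parquet-loader", RunId=run_id)
#   state = run["JobRun"]["JobRunState"]

print(is_finished("RUNNING"), is_finished("SUCCEEDED"))
```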
Setting up a Grafana Dashboard using the Athena plugin